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EQUIVALENCE  PROPERTIES  OF  SEMANTIC  DATA  MODELS 
FOR  DATABASE  SYSTEMS 

by 

Sheldon  A.  Borkin 

This  report  is  hosed  upon  a thesis  of  the  same  title  submitted  to  the  Department 
of  Electrical  Engineering  and  Computer  Science  on  2 January  1979  in  partial 
fulfillment  of  the  requirements  for  the  Degree  of  Doctor  of  Philosophy. 

ABSTRACT 

A data  model  defines  the  types  of  structures  present  in  a database  and  the 
types  of  operations  which  may  be  used  to  alter  the  database.  An  understanding 
of  data  model  equivalence  properties  is  necessary  if  one  wishes  to  implement  a 
system  which  presents  different  users  with  views  of  a database  in  terms  of 
differing  data  models  or  which  provides  a common  interface  to  several  database 
systems  defined  in  terms  of  different  data  models.  Requisites  for  this 
understanding  are  formal  definitions  of  the  involved  concepts  and  a formal 
framework  in  which  different  data  models  can  be  compared. 

This  thesis  presents  formal  definitions  of  the  terms  database,  operation, 
operation  type,  application  model  and  data  model.  Database  state  equivalence, 
operation  equivalence,  application  model  equivalence  and  data  model 
equivalence  are  distinguished.  Three  types  of  application  and  data  model 

equivalence  are  defined  - isomorphic,  composed  operation  and  state  dependent. 
Implementation  implications  of  these  different  equivalences  are  discussed. 

This  definitional  framework  is  used  to  explore  the  equivalence  properties 
of  two  semantic  data  models.  Semantic  data  models  are  data  models  whose 
structures  are  meant  to  have  clear  interpretations  in  terms  of  the  applications 
being  modelled.  The  semantic  graph  and  semantic  relation  data  models  are 

formally  defined.  The  definition  of  the  semantic  relation  data  model  includes 
the  definitions  of  constraints  and  the  semantic  relational  algebra. 

It  is  proved  that  the  semantic  graph  and  the  restricted  semantic  relation 

data  models  are  state  dependent  equivalent.  Observations  on  the  network,  vs, 

relational  data  model  "controversy"  are  presented.  Suggestions  for  applications  of 
this  research  include  a dual  semantic  data  model  interface. 

Keywords:  database  systems,  data  models,  data  model  equivalence 

semantic  relation  data  model,  semantic  graph  data  model 
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Chapter  1 - Introduction 

A user  of  a database  system  always  has  some  view  of  the  structure  of  the 
data  In  the  database  and  knowledge  of  the  operations  which  can  be  used  to  alter 
the  database.  The  types  of  data  structures  visible  to  the  user  and  the  operations 
allowed  on  these  structures  are  determined  by  the  database  system's  data  model. 
The  data  model  determines  what  features  must  be  present  in  the  various 
languages  which  make  up  the  user  interface  to  the  database  system. 

The  database  system  may  Just  be  a simple  file  system  in  which  the  user 
interacts  directly  at  the  storage  device  level.  Then  the  user's  view  of  the 
database  is  that  of  a group  of  files  each  with  certain  record  formats  and 
Indexing  characteristics.  The  applicable  operations  are  the  insertion  and  deletion 
of  records.  In  such  a system,  the  data  model  would  specify  what  types  of 
record  formats  and  indexing  characteristics  are  allowed.  It  would  also  define 
the  operations.  We  will  say  that  a data  model  determines  a set  of  possible 
application  models  where  an  application  model  is  the  description  of  the 
structures  and  operations  in  a specific  system.  In  the  case  of  the  file  system, 
the  application  model  describes  a specific  file  system  where  this  application 
model  is  one  of  those  allowed  by  the  data  model  describing  the  set  of  all 
possible  file  systems. 

In  an  effort  to  simplify  the  user's  view  of  database  systems,  to  allow  for 
portability  across  different  machines  and  to  allow  for  easier  growth  of  databases, 
database  systems  have  been  designed  with  greater  degrees  of  physical  data 
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independence  [Date  and  Hopewell].  The  data  models  used  in  data  Independent 
systems  let  the  user  define  the  logical  structure  of  the  database  and  perform 
operations  without  regard  to  lower  level  implementation  details  and  the  specific 
machine  on  which  the  system  is  implemented.  Such  data  models  are  called 
logical  data  models. 

Existing  implementations  of  such  data  models  present  to  the  user  structures 
such  as  trees,  tables,  or  linked  lists  along  with  appropriate  operations.  The  most 
often  compared  data  models  ([Nijssen  74]  and  [Date  and  Codd])  are  the  relational 
data  model  of  [Codd  70,72,74]  and  the  network  data  model  of  [DBTG].  These 
data  models  are  described  in  Chapter  3. 

1.1  Motivation  for  Thesis  Research 

Different  data  models  may  seem  most  appropriate  for  different  applications. 
For  example,  one  data  model  may  be  most  appropriate  for  an  application  which 
makes  numerous  queries  but  few  updates  to  the  database,  while  another  data 
model  may  be  more  appropriate  for  an  application  which  makes  few  queries  but 
numerous  updates.  One  data  model  may  be  suitable  for  the  person  who  1s 
Intimately  familiar  with  the  details  of  the  application  area,  while  another  data 
model  may  be  easier  to  use  for  the  person  naive  to  the  application  area. 

In  recent  years,  the  number  of  proposed  data  models  has  grown  rapidly. 
The  samplings  of  data  models  in  [Kerschberg  et.  al.]  and  [Senko]  mention  over 
thirty  different  data  models. 
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Given  these  possible  preferences  for  different  data  models,  we  must  ask 
what  can  be  done  if  different  users  want  to  see  the  "same"  database  through 
the  perspective  of  application  models  based  on  different  data  models?  We  would 
want  users  to  be  able  to  make  queries  or  updates  using  either  application  model 
with  an  update  performed  via  one  model  to  be  reflected  in  subsequent  queries 
via  the  other  model.  The  ANSI/X3/SPARC  database  system  architecture  [ANSI] 
proposes  such  a capability.  This  proposal  is  further  discussed  in  Chapter  3. 

In  such  an  architecture,  it  is  desired  that  a user  of  one  application  model 
is  completely  unaware  of  the  fact  that  others  are  using  different  application 
models  possibly  based  on  different  data  models.  Clearly,  there  must  be 
constraints  on  the  application  models  so  that  updates  via  one  application  model 
can  be  represented  as  updates  in  the  others.  If  this  can  be  done,  then  we  will 
say  that  the  application  models  are  equivalent.  If  all  application  models  based 
on  one  data  model  have  equivalent  application  models  based  on  another  data 
model,  and  vice  versa,  then  we  will  say  that  the  data  models  are  equivalent. 

These  notions  of  equivalence  appear  in  other  contexts.  It  may  be  desired 
to  transfer  data  from  one  system  to  another.  In  this  case  we  must  find  out  if 
there  is  an  application  model  in  the  new  system  equivalent  to  the  current 
application  model.  If  the  data  models  on  which  the  application  models  are  based 
are  equivalent,  then  it  will  always  be  possible  to  find  equivalent  application 
models.  If  the  data  models  are  not  equivalent,  then  only  certain  application 
models  in  one  data  model  will  have  equivalents  in  the  other  data  model. 

In  the  context  of  distributed  systems,  it  might  be  desired  to  have  a central 
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interface  to  data  distributed  over  several  different  database  systems  based  on 
different  data  models.  In  this  case,  the  central  interface  application  model  must 
be  equivalent  to  the  "sum"  of  the  other  application  models. 

1.2  Concerns  of  this  Thesis 

To  clearly  understand  what  is  meant  by  "equivalence",  one  must  have  clear 
formal  definitions  of  the  relevant  concepts  and  examples  demonstrating  the  uae 
of  the  definitions.  This  thesis  presents  such  definitions  and  examples  of  their 
use. 

Prerequisites  to  formally  defining  equivalence  are  formal  definitions  of 
terms  such  as  "data  model”  and  "application  model".  Such  definitions  are 
presented  in  this  thesis  as  a formal  framework  in  which  various  data  models 
can  be  defined.  Given  this  ability  to  define  data  models,  concepts  such  as  "data 
model  equivalence",  "application  model  equivalence"  and  "operation  equivalence" 
are  defined.  These  definitions  are  based  on  the  work  in  [Borkln]. 

The  major  portion  of  this  thesis  is  concerned  with  applying  this  formal 
framework  to  the  definition  and  study  of  the  equivalence  properties  of  two  data 
models,  the  semantic  graph  data  model  and  the  semantic  relation  data  modal. 
Being  loosely  related  to  the  relational  and  network  data  models,  the  study  in 
this  thesis  should  shed  some  light  onto  the  relational  vs.  network  "controversy" 
in  addition  to  providing  an  understanding  of  the  issues  involved  in  data  model 
equivalence. 
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A major  claim  of  this  thesis  is  that  the  various  notions  of  equivalence  are 
most  meaningful  when  consideration  is  given  in  the  design  of  a data  model  to 
the  manner  in  which  the  data  model  can  represent  the  Information  of  concern 
to  the  intended  application.  If  we  are  using  a database  system  to  represent 
information  about  some  portion  of  the  real  world,  it  is  not  at  all  clear  which 
data  model  is  most  appropriate  for  the  application.  Should  a person,  for 
example,  be  represented  as  a tree  or  a table? 

Such  problems  have  led  to  the  investigation  of  semantic  data  models. 
These  are  logical  data  models  in  which  the  structures  and  operations  permitted 
are  explicitly  meant  to  represent  certain  types  of  real-world  information.  We 
will  use  the  term  syntactic  data  models  for  those  data  models  which  are  not 
semantic  data  models.  The  two  data  models  defined  in  this  thesis  are  semantic 
data  models.  If  two  semantic  data  models  are  equivalent,  then  we  are  saying 
that  they  have  the  same  "expressive  power"  in  terms  of  what  real-world 
situations  can  be  described  by  their  application  models. 

Note  that  we  do  not  claim  that  there  is  a perfectly  well-defined  boundary 
between  syntactic  and  semantic  data  models.  In  one  sense,  all  formally  defined 
data  models  are  syntactic  since  formal  systems  are  In  the  end  pure  syntax.  In 
the  sense  in  which  we  wish  to  use  the  term,  we  regard  a semantic  data  model 
as  one  which  is  conducive  to  the  user  having  a direct  interpretation  of  the 
database  in  terms  of  the  application. 

Many  workers  in  the  field  of  artificial  intelligence  have  explored  the  use 
of  semantic  data  models  which  are  appropriate  for  applications  which  are  much 
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more  complex  than  those  with  which  this  thesis  will  deal  ([Szolovlts  et.  al.]. 
[Mylopoulos  et.  al.]  and  [Simmons]).  The  intended  use  of  such  complex  data 
models  are  in  computer  systems  which  are  inferential  in  nature.  The  intended 
purpose  of  the  data  models  to  be  presented  in  this  thesis  is  not  to  create 
computer  systems  with  "greater  inferential  capabilities",  but  rather  to  allow  the 
easier  organization  and  use  of  database  systems  similar  in  capability  to  those 
currently  in  widespread  use. 

However,  many  of  the  features  which  make  the  artificial  intelligence  data 
models  semantic  In  nature  are  also  useful  In  semantic  data  models  of  more 
limited  scope.  These  features  have  been  used  In  the  data  models  to  be  presented 
ill  this  thesis.  Similar  ideas  are  behind  the  data  models  of  [Chen]  and  those 
described  in  [Nijssen  76]. 


1.3  Overview  of  Chapters 

Chapter  1 has  presented  an  Introduction  to  the  topics  discussed  In  the  rest 
of  the  thesis. 

Chapter  2 informally  presents  the  semantic  graph  and  semantic  relation 
data  models  along  with  examples  of  their  use. 

Chapter  3 discusses  the  relevant  work  of  other  researchers. 

Chapter  4 presents  a framework  for  data  model  definition. 

Chapter  5 describes  In  detail  the  semantic  graph  data  model. 

Chapter  6 describes  in  detail  the  semantic  relation  data  model. 
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Chapter  7 presents  formal  definitions  of  equivalence. 

Chapters  8 and  9 define  the  restricted  semantic  relation  data  model. 

Chapter  10  proves  the  equivalence  of  the  semantic  graph  and  restricted 
semantic  relation  data  models. 


Chapter  1 1 contains  conclusions  and  directions  for  future  research. 
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Chapter  2 - The  Semantic  Data  Models:  Examples 


For  simple  examples  of  the  data  models  to  be  used  In  this  thesis,  consider 
the  case  in  which  we  want  to  store  information  concerning  a machine  shop. 
The  following  describes  this  application: 

1.  There  are  two  types  of  entitles:  "employees"  and  "machines". 

2.  Each  employee  has  two  characteristics:  "name"  and  "age".  These 
characteristics  must  be  elements  of  the  domains  "names"  or  "years",  respectively. 
No  two  employees  have  the  same  name. 

3.  Each  machine  has  two  characteristics:  "number"  and  "type".  These 
characteristics  must  be  elements  of  the  domains  "serial-numbers"  or 
"machine-types",  respectively.  No  two  machines  have  the  same  number. 

4.  There  is  one  type  of  association : "operation".  The  operation  association  is 
defined  with  two  roles:  the  "agent"  which  must  be  an  employee  and  the 
"object"  which  must  be  a machine. 

5.  Each  machine  must  have  exactly  one  operator,  but  not  every  employee  need 
operate  a machine. 


2.1  The  Semantic  Graph  Data  Model 


The  first  data  model  we  will  use  to  represent  this  application  is  the 
semantic  graph  data  model.  We  will  define  an  application  model  as  consisting 
of  a schema  and  a set  of  operation  types.  The  schema  is  the  representation  of 
the  application  in  terms  appropriate  to  the  data  model  being  used.  The  schema 
for  the  semantic  graph  data  model  is  expressed  as  a labeled  graph  or  its 
equivalent  set  notation  form.  For  the  situation  under  consideration,  the 
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appropriate  schema  is  shown  in  Figure  2.1. 
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employee operation machine 


names  years  serial-numbers  machine-types 

Figure  2.1  - Semantic  Graph  Schema 


There  are  three  kinds  of  nodes  in  the  graph:  entity  set  names,  domain 
names  and  association  set  names.  Role  set  edges  connect  association  nodes  with 
entity  nodes  while  characteristic  set  edges  connect  entity  nodes  with  domain 
nodes.  Solid  lines  used  as  role  set  edges  signify  that  a particular  type  of  entity 
must  always  participate  in  a particular  role  in  an  association  while  a dotted  line 
used  as  a role  set  edge  means  that  the  entity  may  be  in  an  association  of  the 
specified  type  but  it  is  not  required. 

Arrowheads  represent  uniqueness  or  functionality.  For  example,  a 
functional  relationship  between  employee  names  and  employees  means  that  an 
employee  name  uniquely  Identifies  an  employee.  With  these  facts  In  mind,  the 
reader  should  compare  Figure  2.1  to  the  description  of  the  application  stated 
previously. 

The  current  state  of  the  application  consists  of  the  specification  of:  What 
are  the  names  and  ages  of  the  current  employees?  What  are  the  numbers  and 
types  of  the  machines  currently  in  the  shop?  And  who  operates  each  machine? 
This  information  is  stored  in  the  database  system  as  the  database  state.  In  the 
semantic  graph  data  model  the  database  state  is  represented  as  a graph  which 
has  one  object  for  each  object  in  the  application  state.  Figure  2.2  shows  a 
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possible  database  state  as  seen  by  a user  via  the  schema  In  Figure  2.1.  This 
shows  a situation  with  two  machines  and  three  employees.  Two  employees 
operate  machines  while  the  third  does  not. 


emp  1 oyee  <■  Sflgnt  opera t i on  <-  Qb.l6Ct  Macj,  j ne 
name^/*  'S'\^age  number^/*  N''\^type 

T.Hanhart  32  NZ745  lathe 

employee  < afl3nt  operation  <-Pfr.l8Ct  machine 
name yf  '^N^age  number^/^  '^\^ype 

C.Gershag  40  JCL161  press 


employee 

y \ 

name  / \ age 

G.Wayshum  50 


Figure  2.2  - Semantic  Graph  State 


The  second  part  of  the  application  model  is  the  set  of  allowed  operation 
types.  In  the  semantic  graph  data  model,  the  operation  types  allow  the 
insertion  and  deletion  of  independent  entities,  independent  associations  and 
semantic  units.  In  this  particular  situation,  the  operation  types  would  allow 
the  user  to: 

1.  Insert  an  employee  node  and  corresponding  characteristic  edges  checking  to 
make  sure  that  there  is  no  other  employee  with  the  same  name.  This  is  an 
instance  of  inserting  an  independent  entity. 
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?..  Insort  a machine  node  and  corresponding  characteristic  edges  checking  to 
tnake  sure  that  there  is  no  other  machine  with  the  same  number.  And,  at  the 
same  time,  insert  an  operation  association  node,  an  object  role  edge  connecting 
the  machine  and  operation  nodes,  and  an  agent  role  edge  connecting  the 
operation  node  to  some  already  existing  employee  node.  These  actions  must  be 
done  together  because  the  schema  states  that  no  machine  can  exist  without  an 
operator.  So  if  we  insert  a machine,  we  must  also  insert  an  operation 
association.  This  is  an  instance  of  inserting  a semantic  unit. 

3.  Delete  an  employee  node  and  its  characteristic  edges,  so  long  as  the 
employee  node  is  not  connected  to  a role  edge.  This  is  an  instance  of  deleting 
an  independent  entity. 

4.  Delete,  all  at  the  same  time,  an  operation  node,  the  agent  edge  connected  to 
it,  the  object  edge  connected  to  it,  the  machine  node  which  is  in  the  object  role 
and  the  characteristic  edges  of  the  machine.  All  of  the  above  must  be  deleted 
as  a unit  for  the  same  reason  they  must  be  inserted  together.  This  is  an 
instance  of  deleting  a semantic  unit. 

Application  models  for  different  applications  might  allow  the  independent 
insertion  and  deletion  of  association  nodes  along  with  their  role  edges.  Also 
note  that  association  nodes  may  also  have  characteristic  edges.  While  it  is 
possible  for  entitles  to  exist  while  not  connected  to  any  association  node,  it  is 
necessary  for  appropriate  role  entitles  to  exist  before  an  association  node  and 
role  edges  can  be  inserted. 

The  fundamental  idea  behind  this  data  model  is  that  for  every  type  of 
object  in  the  application  there  is  a corresponding  object  in  the  schema.  For 
every  unique  object  in  the  application  state  there  is  a corresponding  object  in 
the  database  state.  "Object”  is  used  not  Just  to  mean  "physical  objects"  which 
would  be  represented  as  entity  nodes,  but  also  "abstract  objects"  such  as  ages 
represented  as  characteristic  nodes,  "events"  represented  as  association  nodes,  and 
the  various  types  of  associations  between  these  objects  represented  as  edges. 
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■ The  semantic  graph  data  model  is  similar  in  some  respects  to  the  data 
models  presented  by  [Chen]  and  [Schmid  and  Swenson].  Chapter  5 will  present 
a complete  formal  description  of  the  semantic  graph  data  model. 

2.2  The  Semantic  Relation  Data  Model 

The  second  data  model  we  will  consider  in  detail  is  the  semantic  relation 
data  model.  While  the  basic  idea  behind  the  semantic  graph  model  is  that  there 
is  one  object  in  the  database  state  for  each  object  in  the  application  state,  the 
idea  behind  the  semantic  relation  data  model  is  that  the  database  state  consists 
of  sets  of  statements  describing  the  current  state  of  the  application.  The 

I 

statements  are  organized  into  relations  where  a relation  represents  the  set  of  all 
true  statements  fitting  a certain  form.  Figure  2.3  is  a relation  describing  a 
portion  of  the  hypothetical  machine  shop  world  state.  Each  row  in  the  relation 

represents  a statement  of  the  form:  " is  the  name  of  an  employee  who 

operates  a machine  of  type  with  number  ."  So  the  row  with 

elements  'T.Manhart",  "NZ745"  and  "lathe"  represents  the  sentence:  "T.Manhart  is 
the  name  of  an  employee  who  operates  a machine  of  type  lathe  with  number 
NZ745." 

The  ordering  of  the  rows  and  columns  is  not  significant.  No  statement 
may  be  duplicated  in  the  same  relation.  That  is,  no  two  rows  may  be  identical. 
Note  that  this  differs  from  the  usual  mathematical  notion  of  relations  in  whlo|l 
the  ordering  of  columns  is  significant.  We  can  allow  this  departure  because  a 
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*  predicatexase  pair 

« entity  type 

*  characteristic 

< domain 


Figure  2.3  - Semantic  Relation 


unique  heading  is  specified  for  each  column. 

The  design  of  the  semantic  relation  data  model  (as  well  as  that  of  the 
semantic  graph  model)  is  influenced  by  the  notion  of  case  grammars  ([Fillmore] 
and  [Bruce]).  This  states  that  the  "underlying"  meaning  of  natural  language 
sentences  can  be  expressed  as  a verb  phrase  (a  predicate)  plus  several  noun 
phrases  - one  for  each  case.  A predicate  corresponds  to  the  verb  phrase  which 
describes  the  type  of  event  represented  by  an  association  name  in  the  semantic 
graph  model.  That  is,  we  will  use  the  predicate  "operate"  in  the  semantic 
relation  data  model  to  describe  in  a statement  the  "operation"  association  in  the 
semantic  graph  data  model.  The  role  an  entity  plays  in  an  association  is  thus 
specified  in  the  semantic  relation  model  by  a predicate  and  a case  such  as 
"operate:agent".  We  will  call  such  a specification  a predicatescase  pair. 


operate -.agent 

operate :object 

employee 

machine 

name 

number 

type 

names 

serial-numbers 

machine-types 

T.Nanhart 

NZ745 

lathe 

C.Gershag 

JCL181 

press 

Each  case  in  the  semantic  relation  model  corresponds  to  a role  in  the 
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semantic  graph  model.  There  is  no  general  agreement  about  what  constitutes 
the  "best”  set  of  cases  (and  hence  roles)  or  which  cases  (roles)  are  necessary  for 
each  predicate  (association).  Many  authors  have  proposed  their  own  guidelines 
for  this  problem  ([Fillmore],  [Martin]  and  [Schank]  for  example).  In  general, 
the  most  useful  set  of  cases  (roles)  must  be  determined  by  the  needs  of  the 
application  area.  Its  determination  is  considered  part  of  specifying  the 
application  model. 

Note  that  each  column  in  the  example  relation  of  Figure  2.3  has  associated 
with  it  four  items:  a specification  of  a predicate: case  pair,  the  type  of  entity 
being  described,  a characteristic  name,  and  a domain  name  specifying  the  set  of 
values  from  which  each  element  in  the  column  must  be  drawn.  For  display 
purposes  we  may  "factor"  the  role  and  entity  type  specification  when  several 
characteristics  describe  the  same  entity.  This  is  done  in  Figure  2.3  for  the 
machine  entity. 


Figure  2.3  does  not  contain  all  of  the  information  shown  in  the  semantic 
graph  state  of  Figure  2.2.  It  is  missing  the  fact  that  G.Wayshum  is  an 
employee  and  also  the  ages  of  all  of  the  employees.  This  is  remedied  by  the 
addition  of  the  relation  shown  in  Figure  2.4.  The  first  row  in  this  relation  can 
be  Interpreted  as  meaning:  "T.Manhart  is  the  name  and  32  is  the  age  of  an 
employee." 

The  predicate:case  pair  in  this  relation,  "be  employee:obJect",  deserves  some 
explanation.  The  objects  of  interest  in  the  application  can  be  either  associations 


or  entities.  The  existence  of  an  association  is  represented  by  a row  in  a 


Figure  2.4  - Semantic  Relation 


relation  with  the  association's  corresponding  predicate.  This  will  also  assert  the 
existence  of  the  entities  which  have  roles  in  the  association.  However,  entities 
may  exist  which  are  in  no  associations.  Therefore  the  predicate:case  pair  "be 
em«y-type:obJect”  is  used  to  state  the  existence  of  entities. 

The  example  relations  given  so  far  give  the  impression  that  each  relation 
deals  with  only  one  predicate.  This  need  not  be  the  case.  We  will  require 
that  for  a complete  representation  of  an  application,  there  must  be  relations 
containing  the  "be”  predicate  for  each  entity  type.  This  guarantees  that  the 
existence  of  every  entity  will  be  stated  whether  it  participates  in  an  association 
or  not. 

This  can  be  accomplished  in  the  current  example  by  modifying  the 
relations  of  Figures  2.3  and  2.4  into  those  of  Figure  2.5.  For  reference 
purposes,  we  have  named  the  relations  in  Figure  2.5.  (Relation  names  will 
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Employees 


be  employee: object 

employee 

name 

age 
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T.Manhart 

32 

C.Gershag 

40 

G.Wayshum 
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operate:agent 
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T.Manhart 
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C.Gershag 

JCL181 

press 
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Figure  2.5  - Complete  Semantic  Relation  State 


always  begin  In  capital  letters.)  The  important  new  feature  is  that  the  second 
and  third  columns  of  the  Operate  relation  are  now  shown  to  be  describing  a set 
of  predicate:case  pairs.  Since  it  is  stated  that  every  machine  must  have  an 
operator,  the  columns  that  describe  the  "operate:obJectM  pair  also  describe  the 
"be  machine:object"  pair  for  machines. 

It  is  possible,  and  would  certainly  be  the  case  in  a more  complex  situation, 
that  the  set  of  predicate-.case  pairs  describing  some  column  contains  several 
predicatercase  pairs  where  none  of  these  pairs  concern  "be"  predicates.  Figure 
2.6  shows  a possible  relation  for  a more  complex  machine  shop.  This  relation 
contains  information  on  both  the  "operate"  and  the  "supervise"  predicates.  The 
first  row  in  this  relation  would  be  interpreted  to  mean:  "T.Manhart  is  the  name 
of  an  employee  who  operates  machine  number  NZ745  and  is  supervised  by  an 
employee  named  G.Wayshum." 


supervise:agent 

supervise  .-object 
operate: agent 

operate robject 

employee 

employee 

machine 

name 

name 

number 

names 

names 

serial -numbers 

G.Wayshum 

T.Hanhart 

NZ745 

G.Wayshum 

C.Gershag 

JCL 181 

Figure  2.6  - Semantic  Relation  With  Set  of  Predlcate:Case  Pairs 


In  such  a relation  with  multiple  predicates,  the  application  state  may  he 
such  that  not  every  predlcate:case  pair  in  a relation  is  applicable  to  each  row  in 
the  relation.  Consider  Figure  2.7.  This  single  relation  contains  the  same 
information  as  the  two  relations  in  Figure  2.5.  The  last  row  in  Figure  2.7  is 
Interpreted  as  saying:  "G.Wayshum  is  the  name  of  an  employee,  age  50,  who 
operates  no  machine."  We  will  refer  to  the  " — " used  to  show  the  absence  of 
any  other  value  as  a null  value. 


be  employee :object 
operate :agent 
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Figure  2.7  - Single  Semantic  Relation  For  Complete  State 


Another  essential  part  of  the  semantic  relation  data  model  is  the  idea  of 
constraints.  Constraints  specify  certain  conditions  which  must  always  be  trun 
of  the  relational  state  describing  a particular  application  state.  For  the  relations 
used  <n  Figure  2.5,  we  would  require  at  least  the  following  constraints: 
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1.  The  names  in  the  first. column  of  -relation  Operate  must  be  a subset  of  the 
flames  in  the  first  column  of  relation  Employees. 

2.  The  first  column  of  Operate  may  have  no  null  values  since  every  machine 
must  have  an  operator. 

3.  A specific  serial-number  may  occur  only  once  in  -the  second  column  of 
Operate  since  each  machine  may  have  no  more  than  one  operator. 

4.  The  names  in  the  first  column  of  Employee  must  be  unique.  This  is 
required  because  we  stated  that  no  two  employees  may  have  the  same  name. 

Constraints  serve  two  purposes.  Firstly,  they  provide  Information  about 
the  meaning  of  a row  in  a relation.  This  is  particularly  true  of  constraints, 
such  as  constraint  4 above,  which  specify  the  uniqueness  of  certain 
characteristics  in  relations  with  the  "be”  predicate.  This  type  of  constraint 
enables  a relational  statement  to  refer  to  a specific  entity  or  association  in  the 
application  state.  For  example,  we  stated  earlier  that  the  first  row  of  relation 
Employees  in  Figure  2.5  means:  "T.Manhart  is  the  name  and  32  is  the  age  of  an 
employee.”  However,  nothing  in  this  sentence  rules  out  the  possibility  of  there 
being  two  such  employees.  With  the  addition  of  constraint  4,  we  can  say  that 
the  correct  statement  represented  by  the  row  is:  "T.Manhart  is  the  name  and  32 
is  the  age  of  a unique  employee.” 

The  second  purpose  of  the  constraints  is  in  relation  to  operations  on  the 
database  state.  The  operations  allowed  in  the  relational  data  model  are  the 
Insertion  or  deletion  of  sets  of  relation  rows.  This  is  to  be  thought  of  as 

i Inserting  or  deleting  sets  of  statements  concerning  the  application  state.  But  an 

update  operation  is  allowed  only  if  all  of  the  stated  constraints  are  true  of  the 
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We  consider  the  schema  of  a particular  semantic  relation  database  model  to 
consist  of  the  relation  definitions  and  the  set  of  constraints.  The  relation 
definitions  can  be  thought  of  informally  as  the  headings  of  the  relations.  The 
database  state  consists  of  the  rows  in  the  relations. 

The  semantic  relation  data  model  is  similar  in  some  respects  to  the 
relational  data  model  of  [Codd  70,71, 72a, 72b,74].  However,  Codd's  data  model  is 
a simple  syntactic  data  model.  Others  have  proposed  similar  data  models  with 
some  semantic  features  ([Chen]  and  [Hall  et.  al.]),  but  not  in  the  depth  or 
completeness  of  the  one  presented  in  this  thesis.  The  semantic  relation  data 
model  will  be  formally  defined  in  Chapter  6. 

2.3  Equivalence  Properties 

While  the  examples  in  this  chapter  are  quite  simple,  they  are  sufficient  to 
introduce  the  kinds  of  equivalence  properties  with  which  this  thesis  will  deal. 
We  claim  that  Figure  2.5  and  Figure  2.2  are  state  equivalent.  That  is,  they 
represent  identical  states  of  the  machine  shop  application.  We  also  claim  that 
the  application  model  represented  by  Figure  2.1  (with  appropriate  operation 
types)  is  application  model  equivalent  to  the  application  model  represented  by 
the  relational  headings  of  Figure  2.5  (with  appropriate  constraints  and  operation 
types). 

As  will  be  seen  in  Chapter  7,  the  claim  of  application  model  equivalence  is 


dependent  upon  the  ability  to  derive  eaulvalent  o Derations  For  examnle.  <lvan 


- 28  - 


an  operation  inserting  an  entity  into  Figure  2.2,  we  must  be  able  to  derive  an 

. 

operation  inserting  statements  into  the  relations  of  Figure  2.5  such  that  the 
resulting  states  are  also  state  equivalent. 

However,  we  will  also  see  that  the  semantic  relation  and  semantic  graph 
data  models  are  not  data  model  equivalent.  There  are  applications  representable 
in  the  semantic  relation  data  model  which  are  not  representable  in  the  semantic 
graph  data  model. 
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Chapter  3 - Relevant  Work  of  Others 

We  will  present  the  relevant  work  of  other  researchers  by  first  discussing 
the  most  influential  data  models.  Then  we  will  see  how  semantic  concepts 
were  introduced  into  data  models.  Lastly  we  will  consider  other  work  relating 
to  equivalence  properties.  For  more  complete  surveys  of  various  data  models  see 
[Kerschberg  et.  al.]  or  [SenKo], 

3.1  The  Codasyl  DBTG  Data  Model 

We  will  discuss  the  Codasyl  DBTG  data  model  [DBTG]  as  typical  of 
syntactic  graph  data  models.  This  data  model  is  more  generally  referred  to  as  a 
network  data  model.  The  basic  methods  of  structuring  data  in  this  model  are 
by*  records  and  "sets".  Different  record  types  are  declared  with  a specification 
of  the  data  items  in  each  record  type  and  what  kind  or  values  are  allowed  for 
each  item  (e.g.  a Cobol  or  PL/ 1 -like  "picture”  clause). 

The  set  concept  is  not  the  same  as  the  mathematical  notion  of  set.  Rather 
a set  specifies  "links"  between  different  record  occurrences  in  the  database  state. 
Specifically,  a set  type  consists  of  an  owner  record  type  and  a member  record 
type.  A set  occurrence  in  the  database  state  consists  of  one  occurrence  of  a 
record  of  owner  type  and  any  number  of  occurrences  of  records  of  member 
type. 

Figure  3.1  shows  a partial  specification  of  one  possible  DBTG  schema 
corresponding  to  the  semantic  graph  schema  shown  in  Figure  2.1.  This  diagram 


Is  a variant  of  a data  structure  diagram  [Bachman  69],  It  specifies  three 
record  types  and  two  set  types,  Emp-op  and  Mach-op.  The  Operation  record 
type  is  a member  of  both  set  types  while  the  Employee  record  type  is  the 
owner  of  the  Emp-op  set  and  the  Machine  record  type  Is  the  owner  of  the 
Mach-op  set  type.  Note  that  the  Operation  record  type  does  not  include  any 
data  items.  Such  a record  is  called  a "linking  record". 


Operation 


name 

age 

number 

type  | 

Employee  Machine 


Figure  3.1  - DBTG  Schema 
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The  DBTG  database  state  corresponding  to  the  semantic  graph  state  in 
Figure  2.2  is  shown  in  Figure  3.2.  Set  membership  is  shown  by  drawing  a 
directed  cycle  of  links  connecting  the  owner  record  of  each  set  occurrence  with 
each  member  record  of  the  set  occurrence.  The  set  type  is  specified  within  the 
cycle. 


A portion  of  a more  complex  schema  is  shown  in  Figure  3.3.  This  schema 
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Machine  Machine 


Employee 

Figure  3.2  - DBTG  State 

includes  a record  type  schema  Supervision  which  is  meant  to  show,  via  its  set 
membership,  the  supervision  relationships  between  employees.  Figure  3.4  shows 
a database  state  corresponding  to  this  schema.  Note  that  two  supervision  records 
are  owned  by  the  Employee  record  with  "G.Wayshum"  as  the  name  value. 

The  basic  operations  used  to  change  the  database  state  are:  store  a record 
occurrence,  delete  a record  occurrence,  insert  a record  into  a set  occurrence. 
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Supervision  Operation 


Figure  3.3  - DBTG  Schema 


remove  a record  from  a set  occurrence,  and  modify  the  values  of  Items  In  a 
record. 

Various  integrity  constraints  can  also  he  specified  by  declaring  In  the 
schema  different  membership  classes  and  by  specifying  data  items  In  record 
types  as  either  being  unique  within  a set  occurrence  or  unique  within  all 
record  occurrences  of  the  record  type.  The  membership  classes  specify  whether 
or  not  a record  occurrence  of  a given  type  is  automatically  inserted  into  a set 
occurrence  of  a given  type  and  whether  or  not  it  is  mandatory  that  once  a 
record  occurrence  is  Inserted  into  a set  occurrence  of  a given  type,  the  record 
must  always  be  in  a set  occurrence  of  that  type.  For  example,  we  would 
declare  the  Machine  record  type  in  Figure  3.1  as  an  automatic  mandatory 
member  of  the  Mach -op  set  type  stating  that  every  machine  must  always  have 
an  operator.  Note  that  if  the  owner  record  of  a set  occurrence  is  deleted,  then 
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Figure  3.4  - DBTG  State 
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all  mandatory  members  of  the  set  occurrence  are  also  deleted. 

There  has  been  much  criticism  of  the  DBTG  data  model  ([Engles]  and  [Date 
and  Codd]).  While  we  have  presented  a simple  overview  of  the  data  model,  the 
full  specification  of  the  data  model  is  extremely  complex  and  contains  many 
restrictions  on  the  allowed  form  of  the  schema.  Many  of  these  restrictions 
seem  to  be  imposed  to  allow  an  easier  implementation  of  a database  system  based 
on  the  data  model.  So,  many  of  the  goals  of  data  independence  seem  to  be 
compromised.  Furthermore,  it  is  not  clear  how  many  of  the  features  of  the 
data  model  should  be  used  to  model  the  abstract  world  being  described  in  the 
database.  [Codasyl]  presents  an  updated  specification  for  this  data  model  which 
attempts  to  counter  some  of  these  criticisms. 

In  addition  to  showing  logical  relationships  between  record  occurrences,  the 
links  showing  set  membership  in  the  DBtG  model  are  meant  to  represent  access 
paths  for  the  retrieval  language  used  to  access  the  database.  [Bubenko  et.  al.] 
show  how  to  define  different  DBTG  schemas  for  the  same  abstract  world 
according  to  the  types  of  queries  anticipated  for  the  database.  Compare  this  to 
the  semantic  graph  data  model  introduced  in  Chapter  2 in  which  there  is  a 
single  schema  corresponding  to  a single  abstract  world. 

A related  criticism  is  that  the  DBTG  report  proposes  a retrieval  and  update 
language  based  on  "one-at-a-time"  access.  In  such  a language  a user  "navigates" 
[Bachman  73]  through  the  database  state  from  record  to  record  following  the 
links  with  the  help  of  a procedural  host  language.  However  it  should  be  noted 


that  a network  structure  need  not  necessarily  be  accessed  with  a "one-at-a-time' 
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language.  [McGee  75]  and  [Deheneffe  and  Hennebert]  present  proposals  for 
higher-level  set  oriented  languages  appropriate  for  a network  structure.  The 
basic  problem  with  these  higher-level  languages  is  that  of  finding  an  efficient 
implementation.  [Bachman  and  Daya]  have  recently  proposed  an  extension  of 
the  DBTG  model  which  contains  notions  of  modelling  semantics. 

If  links  in  the  schema  are  restricted  so  as  to  require  that  the  database  state 
is  a set  of  trees  of  record  occurrences,  then  this  is  said  to  be  a hierarchical 
data  model.  The  most  widely  used  database  system,  IMS  [McGee  77],  is  based 
on  such  a data  model.  Both  "one-at-a-time"  and  higher-level  languages  exist 
for  IMS  [Fehder]. 

3.2  The  Beiational  Data  Model 

To  eliminate  from  the  user's  view  the  access  dependencies  present  in  the 
network  and  hierarchical  data  models,  [Codd  70]  proposed  the  relational  data 
model  (often  referred  to  in  this  thesis  as  the  syntactic  relational  model).  In 
this  data  model,  the  database  state  consists  of  a set  of  named  relations.  Each 
relation  is  a subset  of  the  Cartesian  product  space  (0^  » O2  x , . x 0n)  where 
the  O j are  sets  of  values  called  domains. 

A relation  is  then  a set  of  n-tuples,  or  tuples.  The  relations  are  displayed 
in  tabular  form  where  the  orderings  of  rows  and  columns  are  insignificant.  A 
column  is  identified  by  the  domain  from  which  its  values  are  drawn. 

Figure  3.5  shows  relations  corresponding  to  the  semantic  relations  of 
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Figure  2.5.  It  can  be  seen  that  in  this  particular  case,  the  removal  of  the 
semantic  information  in  the  top  three  rows  of  the  semantic  relations'  headings 
results  in  the  syntactic  relation  of  Figure  3.5.  A syntactic  relation 
corresponding  to  the  semantic  relation  of  Figure  2.6  is  shown  in  Figure  3.6.  To 
distinguish  between  two  columns  defined  on  the  same  domain,  a role  qualifier 
is  appended  to  the  domain  name  as  shown  in  Figure  3.6  by  "oper.names"  and 
"super. names".  A further  generalization  of  the  data  model  is  defined  in  [Codd 
72a]  in  which  all  columns  have  attribute  names  in  addition  to  a domain 
specification. 
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Figure  3.6  - Relation 


The  retrieval  languages  used  with  the  relational  data  model  are  generally 
set  oriented.  We  can  distinguish  three  types  of  relational  set-oriented  languages 
- relational  calculus  [Codd  71],  relational  algebra  [Codd  72b]  and  mapping 
[Chamberlin  and  Boyce],  The  relational  calculus  retrieves  tuples  which  satisfy  a 
predicate  calculus-like  qualification.  The  relational  algebra  retrieves  tuples 
which  are  the  result  of  a sequence  of  relational  operations  performed  on  the 
database  relations.  Mapping  languages  specify  the  tuples  to  be  retrieved  by 
taking  subsets  of  chosen  relations  and  compositions  of  relations. 

However,  it  should  be  noted  that  several  relational  systems  are  based  on 
lower-level  one-tuple-at-a-time  retrieval  ([Astrahan  et.  al.],  [Stonebraker  et.  al.] 
and  [Schmid  and  Bernstein]).  [Astrahan  et.  al.]  actually  allow  the  user  to  access 
the  relations  via  such  a tuple-at-a-time  Interface.  As  in  the  case  of  the 
network  data  model,  both  "high-lever  and  "one-at-a-tlme"  interfaces  are 
possible. 

The  update  operations  allowed  in  the  relational  model  are  the  insertion  and 


deletion  of  sets  of  tuples.  These  operations  must  be  in  accordance  with  the 
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maintenance  of  some  subset  of  each  relation's  columns  as  the  key  of  the 
relation.  That  is,  a tuple  can  be  uniquely  identified  by  the  values  in  these 
columns. 

[Codd  72a]  proposes  "normal  forms"  for  relations  to  prevent  "update 

anomalies".  These  anomalies  correspond  to  database  states  which  do  not 
represent  possible  states  of  the  abstract  world  being  modelled.  The  definition  of 
the  normal  forms  depends  on  the  analysis  of  functional  dependencies. 

The  concept  of  functional  dependencies  is  understood  by  noting  that  the 

definition  of  the  key  of  a relation  specifies  a functional  dependency  between 

the  attributes  which  make  up  the  key  and  the  remaining  attributes  in  the 
relation.  Functional  dependencies  specify  certain  types  of  semantic  features  of 
the  abstract  world.  However,  as  pointed  out  in  [Bernstein],  they  are  not 

sufficient  to  capture  all  of  the  semantic  information  necessary  to  maintain  the 
database  in  a consistent  state. 

[Codd  70]  also  noted  that  other  forms  of  integrity  constraints  might  be 
desirable.  An  update  to  the  database  would  be  allowed  only  if  all  constraints 
were  "satisfied"  by  the  resulting  database  state.  [Hammer  and  McLeod]  and 
[Stonebraker]  have  investigated  such  constraints. 
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3.3  Semantic  Networks 

Semantic  networks  were  introduced  as  a method  of  representing  Knowledge 
in  artificial  intelligence  research  ([Quillian],  [Simmons],  [Szolovits  et.  al.]  and 
[Mylopoulos  et.  al.]).  [Schmid  and  Swenson]  use  a simple  semantic  network, 
data  model  to  explain  what  the  definition  of  normal  form  relations  [Codd  72a] 
means  in  terms  of  the  abstract  world  being  modelled.  They  also  show  how 
constraints  other  than  functional  dependencies  are  necessary  to  maintain 
semantic  integrity. 

A schema  in  this  data  model  is  shown  in  Figure  3.7.  This  figure  defines 
the  characteristic  object  types  names,  years,  serial-numbers  and  machine-types, 
the  independent  object  types  employee  and  machine,  and  the  association 
operate.  A database  state  is  shown  in  Figure  3.8.  It  can  be  seen  that  these 
figures  are  quite  similar  to  the  semantic  graph  schema  and  state  shown  in 
Figures  2.1  and  2.2. 
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Figure  3.7  - Semantic  Network  Schema 


The  operations  allowed  in  this  data  model  are  the  Insertion  of  Independent 
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Figure  3.8  - Semantic  Network  State 


object  types  along  with  their  associated  characteristic  objects,  the  insertion  of 
associations  between  two  existing  independent  object  types,  the  deletion  of  an 
association,  and  the  deletion  of  an  independent  object  so  long  as  it  is  not  in 
any  associations.  The  data  model  includes  the  concept  of  key  characteristics  to 
identify  objects. 

While  [Schmid  and  Swenson]  do  not  completely  specify  their  semantic  data 
model,  the  semantic  graph  data  model  used  in  this  thesis  most  closely  resembles 
this  data  model.  Another  similar  data  model  is  presented  in  [Deheneffe  et.  al.]. 
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3.4  The  Entity-Relationship  Model 

[Chen]  proposes  a "unified"  data  model  which  has  some  features  of  a 
semantic  network  data  model,  yet  uses  a relational  table  to  display  the  database 
state.  The  data  model  schema  is  partially  specified  by  a diagram  as  in  Figure 
3.9. 


Figure  3.9  - Entity-Relationship  Diagram 

Figure  3.9  specifies  two  entity  sets.  Employee  and  Machine,  which  take  on 
roles  Operator  and  Machine  in  the  relationship  Emp-mach.  This  figure 
specifies  a "1:N"  relationship  - i.e.  a machine  can  only  be  in  one  Emp-mach 
relationship  while  an  employee  can  be  in  any  number  of  Emp-mach 
relationships.  The  double  box  denotes  that  the  Machine  entity  set  is  a weak 
entity  set  and  every  machine  must  be  in  an  Emp-mach  relationship. 

A database  state  is  displayed  in  tabular  form  as  in  Figure  3.10.  One 
relation  is  defined  for  each  entity  set  with  each  tuple  corresponding  to  an 
entity  in  the  abstract  world  state.  These  entity  relations  define  values  for  each 
attribute  of  an  entity.  A value  set  for  each  attribute  specifies  the  possible 
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values  of  the  attribute.  Some  group  of  an  entity's  attributes  are  specified  as  the 
primary  key  of  the  entity  relation.  The  values  of  the  primary  hey  must 
uniquely  specify  the  entity. 
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Figure  3.10  - Entity-Relationship  Relations 


For  each  relationship  there  is  a relationship  relation,  such  as  Emp-mach  in 
Figure  3.10.  A relationship  relation  specifies  the  primary  key  values  for  each 
participating  entity  in  a relationship.  Each  tuple  corresponds  to  a single 
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relationship  in  the  abstract  world  state. 

The  operations  in  this  data  model,  though  not  fully  specified,  are  similar 
to  operations  in  the  semantic  graph  data  model.  Inserting  an  entity  or  a 
relationship  corresponds  to  inserting  a tuple  in  the  appropriate  entity  relation  or 
relationship  relation.  Deleting  a relationship  relation  tuple  causes  automatic 
deletion  of  weak  entities  in  the  relationship.  All  operations  must  be  in 
accordance  with  the  stated  relationship  functionalities  and  entity  keys. 

Since  each  relational  tuple  corresponds  to  a single  object  in  the  abstract 
world  state,  this  data  model  corresponds  much  more  closely  to  the  semantic 
graph  model  than  the  semantic  relation  data  model  of  this  thesis.  In  contrast  to 
the  entity-relationship  data  model  which  has  very  strict  requirements  on  the 
allowed  relations  (i.e.  one  per  entity  set  and  one  per  relationship  set),  the 
■ semantic  relation  data  model  allows  a large  variety  of  relations  to  be  defined  for 
a single  abstract  world  model. 

3.5  Multiple  Data  Model  Database  System  Architecture 

The  development  of  many  different  data  models  has  both  motivated  and 
been  motivated  by  the  [ANSI]  report  which  recommends  a database  system 
architecture  based  on  the  concept  of  several  levels  of  database  descriptions.  This 
architecture  allows  for  multiple  users  accessing  a shared  database.  The  three 
database  description  levels,  shown  in  Figure  3.11,  include  the  internal  schema, 
the  conceptual  schema  and  external  schemas. 


Figure  3.11  - ANSI  Architecture 


The  internal  schema  specifies  the  types  of  data  structures,  devices  and 
access  methods  which  constitute  the  physical  storage  aspects  of  the  database 
system.  The  conceptual  schema  specifies  the  contents  of  the  database  in  terms 
of  objects  significant  to  the  application  being  modelled  in  the  database.  There  Is 
much  discussion  over  the  exact  form  which  this  model  should  taXe. 


Suggestions,  many  of  which  we  would  classify  as  semantic  data  models,  are 
found  in  [Nijssen  76],  Other  appropriate  data  models  are  the  data  models 
defined  in  this  thesis. 


The  separation  of  the  conceptual  schema  from  the  internal  schema  results 
in  data  independence  ([Date  and  Hopewell]).  That  is,  the  details  of  the  physical 
storage  representation  can  be  altered  without  affecting  the  view  of  the  database 
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presented  to  the  user.  The  centralized  conceptual  schema  guarantees,  so  long  as 
only  "appropriate"  external  schemas  are  allowed,  that  every  user  sees  the 
database  and  performs  updates  consistent  with  the  application  being  modelled. 

Finally,  external  schemas  provide  the  database  system  user  with  a view  of 
the  database  appropriate  to  his  specific  needs  and  desired  data  model.  The 
schemas  are  to  be  compatible  with  the  particular  programming  or  query 
language  of  the  user.  The  external  schema  may  present  to  the  user  Just  a 
subset  of  the  information  described  in  the  conceptual  schema.  While  this  thesis 
is  directly  relevant  to  the  case  in  which  the  external  schema  describes  all  of 
the  data  present  in  the  conceptual  schema,  the  definitions  to  be  presented  can  be 
extended  to  handle  the  case  where  the  external  schema  describes  a subset  of 
the  conceptual  schema. 

This  architecture  requires  several  mapping  functions  - from  the  conceptual 
schema  to  the  internal  schema  and  from  the  conceptual  schema  to  each  external 
schema.  For  example,  a query  from  a user  posed  in  terms  of  an  external 
schema  must  be  translated  into  a query  in  terms  of  the  conceptual  schema, 
which  must  in  turn  be  translated  into  a query  which  actually  accesses  the 
stored  data.  Results  of  the  query  must  likewise  be  translated  from  an  internal 
representation  to  a form  appropriate  to  the  user  and  the  external  schema. 

The  problem  of  how  to  define  the  mappings  between  schema  levels  so  that 
all  users  see  "equivalent"  models  of  the  application  for  both  retrieval  and  update 
is  extremely  difficult  for  any  case  other  than  letting  the  external  schema  be  a 
simple  subset  of  the  conceptual  schema.  [Klug  and  Tslchrltzis]  provide  an 
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example  of  the  complexities  Involved. 


3.6  Previous  Work  on  Equivalence 


The  difficulty  of  defining  equivalent  database  states  for  syntactic  data 
models  can  be  seen  by  examining  previous  worK  on  equivalence  problems  for 
the  relational  and  DBTG  (or  similar)  models.  [Zimmerman]  and  [Fleck]  both 
require  that  there  be  a relational  tuple  for  each  DBTG  record  plus  a binary 
relational  tuple  for  each  DBTG  set  ownership-membership  link.  These 
restrictions  on  the  form  of  the  relational  state,  and  hence  schema,  severely  limit 
the  types  of  information  which  a user  might  desire  to  appear  together  in  a 
single  relation. 

[Kay]  allows  more  general  relations,  but  allows  updates  to  be  performed 
only  on  those  relations  whose  tuples  are  in  a 1-1  correspondence  with  the  DBTG 
records  and  links.  The  earlier  works  of  [Neuhold]  and  [McGee]  allow  much 
more  general  "equivalence"  mappings,  but  ignore  the  problems  of  performing 
equivalent  updates.  [Klug  and  Tsichritzis]  and  [Paolini  and  Pelagatti]  who  also 
allow  general  mappings  point  out  the  difficulties  of  performing  equivalent 
updates  for  arbitrary  conceptual  to  external  database  state  mappings. 

[Navathe  and  Schkolnick]  define  a data  model  to  facilitate  "view 
integration".  View  integration  is  the  process  of  developing  a single  model  of 
the  application  consistent  with  each  user's  view  of  the  application.  This  is 
similar  to  the  problem  of  ensuring  that  each  ANSI  external  schema  is  compatible 


i 


- 47  - 

with  the  conceptual  schema.  Navathe  and  Schkolnick  properly  emphasize  the 
Importance  of  stating  and  taking  into  account  the  effects  of  operations  altering 
the  database  state. 
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Chapter  4 - Framework  for  Data  Model  Definition 

This  chapter  defines  a framework  for  data  model  definition.  By  this,  we 
mean  that  we  will  identify  those  components  which  constitute  a formal  data 
model  definition.  This  framework  will  then  be  used  in  following  chapters  to 
formally  define  the  semantic  relation  and  semantic  graph  data  models.  All  of 
the  non-inferential  data  models  discussed  in  Chapter  3 can  also  be  defined  in 
this  framework. 

A basic  premise  of  what  follows  is  that  the  database  system  is  used  to 
model  some  portion  of  the  "real  world"  which  is  of  interest  to  the  system  user. 
We  will  call  this  portion  of  the  real  world  the  application.  Example 
applications  are:  a factory's  Inventory  control  system,  the  personnel  assignments 
in  a machine  shop  or  an  all-encompassing  "corporate  database".  The  application 
state  represents  a "snapshot"  of  the  application  at  a given  time. 

The  database  state  may  consist  of  objects  which  are  in  a 1-1 
correspondence  with  the  application  state  (a  common  interpretation  of  the 
network  approach)  or  it  may  consist  of  "statements"  about  the  application  state 
(an  interpretation  of  the  relational  approach).  Other  approaches  are  possible.  In 
any  case,  we  assume  that  the  database  models  some  application  which  can  be 
thought  of  as  having  a state  and  certain  allowed  transitions  between  states.  In 
defining  equivalences  in  this  thesis  we  will  be  concerned  about  different  data 
models'  capabilities  in  expressing  these  application  models,  their  states  and  their 


transitions. 
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A data  model  is  defined  as  a (finite  or  infinite)  set  of  application  models. 

data  model  = {application  model  1 .application  model2,  . . . application  modeln} 

Each  of  the  application  models  specifies  a view  of  an  application  to  the  database 
system. 

4.1  Application  Models 

An  application  model  does  not  specify  the  state  of  an  application  at  a given 
point  in  time  (with  the  exception  of  the  "initial  state").  This  is  the  purpose  of 
the  database  state.  Rather,  the  application  model  specifies  which  database  states 
are  valid  representations  of  application  states  and  which  transitions  between 
states  are  allowed. 

To  specify  the  allowed  states  and  transitions  for  each  application,  »" 
application  model  consists  of  a schema,  an  Initial  state  and  a function  which 
maps  a finite  set  of  operation  type  names  into  operation  types.  An  operation 
type  is  the  functional  definition  of  the  named  operation. 

application  model  = (schema,  initial  state, 

[operation  type  name  -»  operation  type]) 

The  schema  can  be  considered  to  be  the  "declarative”  portion  of  the 
application  model.  A basic  notion  is  that  objects  in  the  database  state  are 
instances  of  objects  in  the  database  schema.  For  example,  corresponding  to  an 
entity  set  node  in  a semantic  graph  schema  there  are  entity  nodes  in  the 
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semantic  graph  state.  Corresponding  to  a relational  heading  in  a semantic 
relation  schema  there  are  tuples  (statements)  in  the  semantic  relation  state 
which  "fit  the  form"  of  the  relational  heading. 

Every  time  a user  performs  some  operation  on  the  database,  the  schema  Is 
consulted  to  make  sure  the  operation  is  "legal”.  Consequently,  the  schema  also 
contains  constraints  concerning  the  legality  of  applying  operations  to  given 
states.  In  the  semantic  graph  schema,  the  constraints  take  the  form  of  the 
function  defining  arrowheads  and  in  the  distinction  between  solid  and  dotted 
edges.  In  the  semantic  relation  schema,  there  is  an  explicit  set  of  integrity 
constraints. 


4.2  Operation  Types 


Operation  types  are  the  "procedural"  portions  of  the  application  model. 
They  specify  what  a user  may  do  to  change  the  database  state.  In  the  semantic 
relation  data  model,  the  operation  types  are:  Insert-tuples  and  Delete-tuples  which 
are  used  to  insert  or  delete  sets  of  tuples  (statements).  Some  of  the  operation 
types  for  the  semantic  graph  data  model  are:  Insert-ind-entity  for  inserting  an 
independent  entity  and  Insert-semantic-unit  for  inserting  a semantic  unit. 

For  the  data  models  studied  in  this  thesis,  all  application  models  in  a given 
data  model  have  identical  sets  of  operation  types.  This  is  also  true  of  all  data 
models  in  common  use.  However,  it  may  be  desirable  to  have  data  models 
which  allow  user-defined  operation  types.  This  bears  relationships  to  the  notion 
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of  abstract  data  types  [Liskov  et.  al.].  The  desirability  of  having  data  models 
which  allow  user-defined  operation  types  is  discussed  in  the  concluding  chapter 
of  this  thesis. 

Formally,  each  operation  type  is  specified  as  a function 

operation  type  : (schema  x arguments  x*  database  state)  -*  database  state. 

Given  a schema,  a database  state  and  some  arguments  whose  types  depend  on  the 
specific  operation  type,  an  operation  type  defines  a new  database  state.  One 
such  possible  new  state  is  the  error  state. 

For  the  study  of  formal  equivalence  properties,  it  is  necessary  to 
distinguish  between  operation  types  and  operations.  An  operation  is  a function 
which  maps  each  possible  database  state  into  another  database  state. 

operation  : database  state  -»  database  state 

Given  a schema  and  the  set  of  possible  other  arguments  for  each  operation  type, 
we  can  generate  an  application  model's  set  of  allowable  operations.  For 
example,  given  a relational  schema,  there  would  be  an  operation  corresponding 
to  the  insertion  or  deletion  of  each  possible  set  of  tuples. 

The  initial  state  of  an  application  model  specifies  the  database  state  which 
corresponds  to  the  application  state  at  "system  Initialization"  time.  This  might, 
for  example,  be  the  "empty  state".  This  state  is  of  particular  importance  in  that 
the  closure  of  an  application  model's  set  of  allowable  operations  applied  to  this 
Initial  state  defines  the  set  of  valid  (reachable)  states  for  that  application 
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model.  When  studying  application  model  equivalence,  we  need  only  be 
concerned  with  those  states  which  are  actually  reachable. 

4.3  Databases 

Thus  far  we  have  used  the  term  database  state  without  any  further 
explanation.  The  specific  form  of  the  database  state  depends  on  the  data  model 
under  discussion.  In  the  semantic  relation  model,  the  state  is  formally  expressed 
as  a mapping  from  relation  names  to  sets  of  statements.  A semantic  graph  state 
is  formally  expressed  as  sets  of  nodes  and  edges. 

Finally,  we  can  define  a database  as  specifying  for  an  application  both  the 
current  state  and  all  possible  state  transitions. 

database  = (application  model,  database  state) 

That  is,  in  the  context  of  a database  system  there  is  a static  component,  the 

. * • 

application  model,  and  a dynamic  component,  the  database  state.  The  application 
model  determines  the  set  of  valid  operations  while  the  database  state  is  the 
database  system's  representation  of  the  application  state. 

There  are  several  aspects  of  a database  system  which  we  will  not 
explicitly  explore  in  the  framework  or  elsewhere  in  this  thesis.  However,  the 
framework  can  easily  be  extended  to  discuss  these  Issues.  These  Issues  are  the 
questions  of  application  model  modification,  concurrent  access  to  a shared 
database  and  retrieval  operations. 


At  a higher  level  there  must  be  a means  of  declaring  and  perhaps 
modifying  the  application  model.  We  will  not  explicitly  discuss  how  this  Is  to 
be  done. 

Similarly,  a higher  level  view  is  need  to  discuss  the  problems  of 
concurrent  access  to  shared  databases.  We  will  simply  assume  that  to  each  user, 
it  appears  as  though  he  were  the  only  user  of  the  system.  Alternatives  to  this 
view  are  discussed  in  [Gray  et.  al.]  and  [Lamport]. 

We  will  not  specify  retrieval  operations  and  query  languages.  Since  we 
are  assuming  the  database  system  to  be  non-inferential,  retrieval  operations  can 
be  thought  of  as  simply  displaying  some  subset  of  the  database  state. 

The  definitions  used  to  define  data  model  are  summarized  in  Figure  4.1. 


data  model  = (application  model  j .application  model2,  . . . application  modeln} 

application  model  = (schema,  initial  state, 

[operation  type  name  -♦  operation  type]) 

operation  type  : ( schema  x arguments  x database  state)  -*  database  state 

operation  : database  state  -»  database  state 

database  = (application  model,  database  state) 


Figure  4.1  - Summary  of  Terms 
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Chapter  5 - The  Semantic  Oraph  Data  Model 

The  fundamental  concepts  of  the  semantic  graph  data  model  are  that  the 
database  state  contains  an  object  corresponding  to  each  object  in  the  application 
state,  and  that  the  operations  applicable  to  the  database  state  correspond  to  the 
"elementary"  types  of  changes  which  can  occur  in  the  application  state.  By 
"elementary”  we  mean  that  these  are  the  smallest  units  of  change  possible 
which  would  leave  the  database  in  a semantically  consistent  state.  All  other 
semantically  meaningful  state  changes  can  be  described  as  a sequence  of 
elementary  changes.  We  will  describe  the  semantic  graph  data  model  in  terms 
of  the  framework  presented  in  Chapter  4. 

5.1  The  Semantic  Graph  Application  Model  Schema 

The  semantic  graph  application  model  schema  describes  the  application  by 
stating  the  general  classifications  of  the  objects  to  be  modelled.  It  is  assumed 
in  the  semantic  graph  model  that  each  object  can  be  classified,  for  example,  as  a 
characteristic,  an  entity  or  an  association.  Examples  of  these  classifications 
were  presented  in  Chapter  2.  Since  each  type  of  database  object  has  its  own  set 
of  allowed  operations,  the  formal  basis  for  deciding  what  type  of  object  should 
be  used  to  model  a real-world  object  is  according  to  what  kinds  of  state 
changes  should  be  allowed  on  the  database  representation  of  the  real-world 
object. 
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In  set  theoretic  terms,  a semantic  graph  schema  Is 

(E,  A,  D,  r,  c,  fr,  fc,  sd,  R,  C,  Str) 
where  the  components  are  defined  as  follows: 

E:  This  is  a finite  set  of  entity  set  names.  Each  element  of  this  set 

corresponds  to  a unique  entity  set  node  in  the  graph  representation  of  the 
schema.  Examples  of  elements  of  this  set,  drawn  from  the  machine  shop 
example  presented  in  Chapter  2,  are  "employee”  and  "machine". 

A:  This  is  a finite  set  of  association  set  names.  Each  element  of  this  set 
corresponds  to  a unique  association  set  node  in  the  graph  representation  of  the 
schema.  An  example  of  an  element  of  this  set  is  "operation". 

D:  This  is  a finite  set  of  domain  names.  Each  element  of  this  set 


corresponds  to  a domain  node  in  the  graph  representation.  Examples  are 
"names”,  "years",  "serial-numbers"  and  "machine-types". 

r:  This  is  the  role  set  edge  function,  r is  a function 

r:  (A  X R)  -»  E 

where  R is  the  set  of  role  names.  Role  names  are  such  things  as  "agent"  and 
"object"  which  are  derived  from  the  idea  of  cases  in  case  grammars  as 
mentioned  in  Chapter  2.  The  roles  we  use  in  examples  are  based  on  the  work, 
of  [Fillmore].  Each  role  set  edge  is  shown  in  the  graph  representation  as  a line 
connecting  an  association  set  node  and  an  entity  set  node.  The  line  is  labelled 
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with  the  role  name.  The  direction  of  the  edge  will  explained  later  in  this 
chapter,  r is  a function  because  we  require  that  a given  role  name  may  label 
only  a single  edge  terminating  on  any  particular  association  set  node.  This  is 
simply  a statement  that  a role  in  an  association  is  only  fulfilled  by  one  type  of 
entity. 


C:  This  is  the  characteristic  set  edge  function,  c is  a function 

C:  (EuA  x C)  -*  D 

where  C is  the  set  of  characteristics  such  as  "name",  "age",  "number"  and 
"type".  Each  characteristic  set  edge  is  shown  in  the  graph  representation  as  a 
line  connecting  an  entity  set  node  or  an  association  set  node  with  a domain 
node.  The  line  is  labelled  with  the  characteristic  name.  The  direction  of  the 
edge  will  be  explained  later  in  this  chapter.  A given  characteristic  name  may 
label  only  one  edge  terminating  on  any  particular  entity  set  or  association  set 
node. 


The  functions  fr  and  fc  will  specify  restrictions  on  the  values  of 
characteristics  and  on  the  combinations  of  entities  into  associations  which  may 
occur  in  the  database  state.  These  restrictions  serve  two  purposes.  Firstly,  they 
express  restrictions  which  occur  in  the  application  such  as.-  "Each  machine  may 
have  only  one  operator."  Secondly,  they  provide  means  by  which  objects  in  the 
database  state  can  be  identified.  That  is,  "A  machine  is  uniquely  identified  by 


its  serial  number.' 


- 58  - 


fr:  This  is  a function 

fr:(A  x R ) -*  ( (restr  icted.unrestr  ictedl  x (mandatory, non-mandat oryl ) . 

This  function  determines  two  features  of  role  set  edges.  A role  set  edge  may  be 
mandatory  or  non-mandatory.  An  edge  designated  mandatory  means  that  every 
entity  in  the  database  state  of  the  type  connected  to  the  edge  must  be  a 
participant  in  at  least  one  association  of  the  type  also  connected  to  the  edge.  In 
the  machine  shop  example,  the  edge  connecting  the  machine  entity  set  node 
with  the  operation  association  set  node  is  such  an  edge  because  we  require  that 
each  machine  must  be  operated.  The  feature  which  is  specified  as  restricted 
or  unrestricted  will  be  discussed  in  the  definition  of  fc  below. 

In  the  graph  representation  of  the  schema,  we  will  show  mandatory  edges 
as  solid  lines  while  non-mandatory  edges  will  be  shown  as  dotted  lines.  An  edge 
which  is  unrestr  icted  will  have  an  arrowhead  terminating,  on  the  entity  set 
node  while  a restricted  edge  will  have  an  arrowhead  terminating  on  the 
association  set  node. 

fC:  This  is  a function  fc:(EuA  x C)  -*  (key.non-keyt . It  assigns  a feature 
to  each  characteristic  set  edge.  The  significance  of  this  in  the  case  of 
characteristic  set  edges  connected  to  entity  set  nodes  is  that  in  the  database  state 
there  may  be  only  a single  entity  node  connected  to  a given  set  of  domain 
element  nodes  if  all  of  the  corresponding  characteristic  set  edges  have  the  key 
feature.  In  other  words,  if  an  entity  set  node  has  at  least  one  characteristic  set 


edge  with  the  key  feature,  then  in  the  database  state,  each  entity  of  that  type 
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can  be  uniquely  identified  by  its  key  characteristics.  An  example  is  the  "name" 
characteristic  being  key  for  the  "employee"  entity  set  and  the  "number" 
characteristic  being  key  for  the  "machine”  entity  set. 

The  significance  of  the  feature  for  association  nodes  which  have 
characteristics  is  somewhat  different  and  involves  the  restricted  or 
unrestricted  feature  specified  by  the  function  fr.  Associations  in  the  database 
state  are  uniquely  identified  by  key  characteristics  of  an  association  along  with 
the  key  characteristics  of  entities  in  restricted  roles  of  the  association. 

If  an  association  set  has  no  key  characteristics,  then  the  restricted  entities 
uniquely  Identify  the  association  in  the  database  state.  If  an  association  set  has 
no  restricted  roles,  then  the  key  characteristics  uniquely  identify  the 
association.  For  an  association  which  has  both  restricted  roles  and  key 
characteristics,  then  it  is  a combination  of  these  roles  and  characteristics  which 
can  uniquely  identify  an  association  of  this  type  in  the  database  state.  An 
example  of  an  association  set  with  key  characteristics  will  be  presented  later  in 
this  chapter. 

If  a characteristic  set  edge  has  the  key  feature,  then  an  arrowhead  is 
placed  on  the  end  of  the  edge  terminating  on  the  entity  or  association  set  node. 
If  the  feature  is  non-key,  then  an  arrowhead  is  placed  on  the  domain  node  end 
of  the  edge. 

Sd:  This  is  a function  which  provides  the  definition  of  domain  element 
representation.  Domains  in  the  application  are  sets  of  semantic  concepts  such  as 
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"the  number  1"  or  "the  color  red".  In  the  set  theoretic  formalisms  we  are 
giving  which  are  to  model  these  concepts  we  must  eventually  get  to  the  point 
where  we  agree  on  symbolic  representations  for  these  concepts.  The  function 
sd  specifies  such  a symbolic  representation  of  each  of  the  domains  named  in  the 
set  D. 

We  will  require  that  every  domain  element  can  be  expressed  as  a character 
string,  though  not  necessarily  a single  word.  We  will  also  require  that  for  a 
given  domain,  distinct  character  strings  represent  distinct  abstract  domain 
elements.  Note  that  this  is  not  true  of  natural  language.  Both  "4"  and  "IV" 
represent  the  same  abstract  number.  A domain  may  be  infinite  such  as  the 
domain  "natural  numbers"  or  a domain  may  be  finite  such  as  the  domain  "sex" 
with  elements  "male"  and  "female". 

sd  will  then  be  a function 

sd:D  -»  (Str  -♦  (true,  falsel  1 . 

The  function  maps  each  element  of  0 into  a domain  specification.  A domain 
specification  is  a function  which  maps  each  possible  character  string  into  true 
or  false  indicating  whether  or  not  the  character  string  is  a valid  representation 
of  some  domain  element.  We  will  not  further  specify  the  set  Str  of  possible 
character  strings.  A set  appropriate  to  the  application  being  modelled  is 
assumed. 

We  have  seen  above  that  the  features  specified  by  the  functions  fr  and  fc 
restrict  the  possible  combinations  in  the  database  state  of  entities  into 
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associations  and  of  characteristics  of  entities  and  associations.  This  provides  a 
means  of  identifying  entities  by  their  key  characteristics  and  of  identifying 
specific  associations  by  specifying  the  identities  of  the  entities  participating  in 
the  restricted  roles  of  the  association  and  the  key  characteristics  of  the 
association. 

We  are  now  going  to  make  the  assumption  that  each  entity  type  has  at 
least  one  key  characteristic  and  that  each  association  type  has  at  least  one  key 
characteristic  or  at  least  one  restricted  role.  This  eliminates  the  possibility  of 
having  two  entities  of  the  same  type  with  identical  characteristics.  It  also 
eliminates  the  possibility  of  having  two  associations  of  the  same  type  which 
have  both  identical  characteristics  and  entitles  participating  in  each  role  which 
have  identical  characteristics. 

The  sets  R and  C have  been  described  in  connection  with  r and  c.  Str  has 
been  described  in  connection  with  sd. 

This  concludes  the  set  definitions  of  the  semantic  graph  schema.  Figure 
5.1  shows  a more  complicated  schema  which  will  be  used  in  subsequent 
discussions.  Some  features  of  this  schema  are: 


1.  The  entity  set  machine  now  has  key  characteristics,  type  and  number,  and 
non-key  characteristics,  age  and  location. 

2.  The  operation  association  now  has  a key  characteristic,  shift,  and  a 
non-key  characteristic,  cost- per- hour.  That  is,  both  a shift  and  a machine 
identification  are  needed  to  identify  a particular  operation  association.  And  the 
cost-per-hour  to  operate  a machine  is  a characteristic  of  a particular  machine  on 
a particular  shift. 
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employee  < operation  « machine 


Figure  5.1  - More  Complex  Semantic  Graph  Schema 


3.  The  supervision  association  in  this  schema  has  both  the  agent  and  object 
roles  from  the  entity  set  employee. 

4.  The  domain  years  is  used  by  characteristics  of  both  the  employee  and  the 
machine  entity  sets. 

5.2  The  Semantic  Graph  Database  Stale 

Before  defining  the  operation  types  which  constitute  the  second  component 
of  an  application  model,  we  will  give  the  formal  definition  of  the  database  state 
on  which  the  operations  are  to  be  used.  The  database  state  is  a model  of  the 
application  state.  For  each  object  in  the  application  state,  there  is  a 
corresponding  object  in  the  semantic  graph  database  state.  Every  database  state 
is  based  on  some  schema.  For  the  following  definitions,  we  will  assume  that 
an  appropriate  schema 


(E,  A,  0,  r,  c,  fr,  fc,  sd,  R,  C,  Str) 
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has  been  defined.  A semantic  graph  database  state  is  then  represented  in  set 
theoretic  terms  as 

<£,  A,  D,  r,  c) 

where  the  components  are  defined  as  follows: 

D.  This  is  the  set  of  domain  element  nodes.  Each  element  in  this  set  is 
an  element  of  (D  * Str).  Each  node  specifies  a domain  name  and  the  character 
string  representation  of  some  element  in  the  domain. 

Example  elements  of  0 are  (names.  T.Manhart)  and  (serial-numbers,  NZ745). 
In  the  graph  representation  of  the  state  we  will  show  only  the  domain  element 
character  string  as  the  node. 

ft  This  is  the  set  of  entity  node  identifiers.  Each  element  of  this  set 
corresponds  to  an  entity  node  in  the  graph  representation  of  the  database  state. 
The  identifier  consists  of  the  values  of  those  characteristics  which  uniquely 

identify  the  node.  Each  entity  node  identifier  in  £"  is  an  element  of  the  set 
E-id  defined  as 

E-id  - (E  x (C  -<  (D  x StrlJ). 

Each  identifier  specifies  an  entity  set  name  giving  the  type  of  the  entity  and  a 
partial  function  which  maps  each  key  characteristic  of  the  entity  into  the 
domain  type  and  string  which  represents  the  domain  element  for  that 
characteristic.  An  example  would  be  (employee,  {(name,  (names,  T.Manhart)))). 
Note  that  we  use  a set  of  ordered  pairs  to  specify  a function.  This  will  be 
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done  throughout  this  thesis. 

Clearly,  not  every  element  of  E-id  would  be  a valid  identifier.  For 
example,  it  must  be  required  that  the  characteristics  defined  are  only  those 
which  are  key  for  the  entity  type  and  that  the  character  strings  are  indeed 
valid  domain  element  representations  as  defined  by  the  domain  specification  in 
the  schema.  Since  the  state  can  only  be  modified  by  the  operations  defined  in 
the  application  model,  checks  such  as  these  are  made  in  the  operations  (which 
will  be  defined  in  the  next  section  of  this  chapter). 

The  elements  of  £,  and  the  corresponding  nodes  in  the  graph,  can  be 
partitioned  into  entity  sets  according  to  the  entity  set  name  which  is  the  first 
component  of  the  identifier.  In  the  graph  representation  of  the  state,  the  entity 
set  name  is  used  as  the  node  label,  but  this  labelling  is  not  necessarily  unique. 
However,  there  will  be  a unique  node  which  has  the  label  specified  by  the 
identifier  and  has  edges  connecting  the  node  to  the  characteristic  domain 
elements  specified.  That  is,  there  are  two  “machine’'  nodes  in  Figure  2.2,  but 
they  are  distinguished  by  the  "number"  characteristic. 

A:  This  is  the  set  of  association  node  identifiers.  Each  identifier  is  an 
element  of  the  set  A- id  defined  as 

A-id  « (A  x [R  -*  E-id)  x [C  -»  (D  x Str)J). 

An  association  Identifier  must  specify  the  association  set  name,  the  entity 
identifiers  of  each  entity  in  a restricted  role,  and  the  domain  element  for  each 
key  characteristic.  As  with  entity  sets,  the  elements  of  A are  partitioned  into 


- 65 


association  sets  by  the  specified  association  set  names.  The  update  operations 
must  enforce  certain  requirements  such  as  that  of  making  sure  that  each  entity 
Identifier  has  an  entity  set  name  which  is  the  same  as  that  which  the  schema 
specifies  should  fill  this  role.  Such  constraints  are  enforced  (and  defined)  by 
the  operation  type  definitions. 

An  example  element  of  A would  be: 

(operation,  ((object,  (machine,  ((number,  ( serial-numbers,  NZ745)))))I,  (I). 

n This  is  the  set  of  role  edges.  Role  edges  connect  entity  nodes  with 
association  nodes  specifying  the  role  which  the  entity  has  in  the  association,  r 
is  a subset  of 

(A- id  x R x E- i d) . 

The  type  of  the  entity  node  for  each  role  must  be  the  same  as  that  specified  by 
the  role  set  edges  in  the  schema.  It  is  required  that  every  role  defined  for  an 
association  set  in  the  schema  has  a corresponding  role  edge  for  each  association 
in  the  database  state.  Since  each  association  may  have  only  one  entity  for  each 
role,  r will  be  expressed  as  a partial  function 

T:  (A- id  x R)  ■*  E-id. 

In  the  graph  representation  of  the  state,  these  edges  connect  the  association 
specified  to  the  entity  specified  with  a line  labelled  by  the  role  name. 

C:  This  is  the  set  of  characteristic  edges.  It  is  a subset  of 

(E-idOA-id  x C x (D  x Str)). 

Each  element  specifies  an  entity  or  association  node  and  the  domain  element 
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tiode  which  is  the  specified  characteristic.  Each  element  of  E and  A must  be 
connected  to  an  edge  corresponding  to  each  of  the  characteristic  set  edges 
specified  in  the  schema  for  that  type  of  entity  or  association.  c will  be 
expressed  as  a partial  function 

C:  (E-idUA-id  x C)  -*  (Ox  Str). 

This  completes  the  set  definitions  of  the  database  state.  To  fully  specify 
all  of  the  constraints  on  the  elements  in  these  set,  we  must  look  at  the 
operation  types  to  be  defined  in  the  next  section  which  modify  the  state. 
Figure  5.2  shows  a database  state  in  graph  notation  corresponding  to  the  schema 


of  Figure  5.1. 
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NZ745 


cost-per-hour  number  / age 


agent  | object  i 

employee  < operation  « machine 


shift  location 


B.Belglm  35  2 


1-A  lathe 


agent 


supervision 


object 


cost-per-hour 


agent  | object 

employee  « operation  < 


shift 


T.Hanhart  32 


agent 


supervision 


object 


JCL181 


shift 


number  / age 


agent  i object  i 
employee  * operation  * machine 


cost-per-hour 

location 


C.Gershag  40  75 


employee 


2-B  press 


G.Wayshum  50 


Figure  6.2  - More  Complex  Semantic  Graph  State 
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6.3  The  Semantic  Graph  Operation  Types 

The  six  semantic  graph  operation  types  allow  the  database  user  to  insert  or 
delete  from  the  database  state  independent  entities,  independent  associations  ox 
semantic  units.  A semantic  unit  is  a group  of  entities  and  associations  which 
must  be  inserted  or  deleted  as  a single  unit  due  to  mandatory  role  requirements. 
An  example  of  a semantic  unit  is  the  combination  of  a machine  entity  and  an 
operation  association  in  Figure  2.2.  The  machine  cannot  exist  without  being  in 
an  operation  association  due  to  the  mandatory  role  restriction. 

An  independent  entity  is  an  entity  whose  entity  type  is  not  a mandatory 
role  in  any  association  type.  An  independent  association  is  an  association  which 
can  be  added  (or  deleted)  to  the  database  state  without  adding  (or  deleting) 
entities  at  the  same  time  because  either  its  association  type  does  not  have  any 
mandatory  roles  or  the  type  has  mandatory  roles  but  every  entity  participating  in 
the  association's  mandatory  roles  also  is  in  a different  association  of  the  same 
type. 

We  will  give  informal  discussions  of  each  operation  type  and  provide 
examples.  In  addition,  formal  definitions  are  given  in  Appendix  1 for  each 
operation  type.  In  Appendix  1,  a notation  similar  to  a programming  language  is 
used  for  specifying  the  function  of  each  operation.  All  of  the  formal 
definitions  of  database  operations  in  this  thesis  specify  functions  from  some  set 
of  arguments  to  the  set  of  database  states.  The  definitions  do  not  specify  a 


specific  computer  Implementation  of  the  data  models  discussed.  The  basic 
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constructs  of  the  notation  are  similar  to  the  applicative  language  of 
[Dennis  et.  al.]  and  could  he  equivalently  represented  in  lambda  calculus 
notation  [McCarthy],  The  various  type  domains  are  specified  In  Figure  A 1.1  of 
Appendix  1.  A Pascal-like  notation  [Wirth]  is  used  to  select  elements  from 
tuples  (i.e.  like  Pascal  records).  We  will  not  specify  the  semantics  of  the 
notation  further. 

Figure  A1.2  specifies  certain  functions  which  are  regarded  as  predefined. 
Several  of  these  are  based  on  an  assumed  ordering  of  the  sets  of  role  names  and 
association  type  names.  We  will  also  assume  there  is  a special  value  error  in 
every  type  domain  and  that  any  function  which  is  given  error  as  an  argument 
will  evaluate  to  error. 

5.3.1  Insert-md-entity 

Figure  A 1.5  defines  the  operation  type  which  is  used  to  add  Independent 
entities  to  the  database  state.  The  function  Insert  tnd  entity  maps  a graph  schema, 
a graph  state  and  an  entity  description  into  a graph  state.  The  auxiliary 
function  valid-entity  defined  in  Figure  A 1.3  is  used  to  check  that  the  entity 
description  is  in  proper  form  and  that  there  does  not  already  exist  an  entity  In 
the  database  state  with  the  same  key  characteristics. 

If  these  tests  are  passed  and  if  the  entity  type  is  truly  independent,  i.e.  it 
is  not  a mandatory  role  for  any  association  type,  then  lnsert-ind-entity  evaluates  to 
a database  state  which  is  identical  to  the  argument  state  except  for:  1).  the 
addition  of  the  entity  id  to  the  set  of  entity  node  identifiers,  11).  the  addition  to 
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the  set  of  domain  element  nodes  those  domain  elements  used  to  define  the 
entity  characterise  ics,  and  iii).  the  addition  of  those  characteristic  edges  which 
connect  the  entity  node  and  the  domain  elements. 

For  the  reader's  convenience,  Figures  2.1  and  2.2  are  repeated  as  Figures 
5.3  and  5.4.  These  figures  will  be  used  in  demonstrating  operations  in  this 
chapter.  All  example  operations  will  be  specified  using  both  a graphical 
notation  and  a formal  set  notation  which  is  appropriate  to  the  definitions  in 
Appendix  1. 

Figure  5.5  shows  the  database  state  resulting  from 
Insert-ind-entity(schema5.3,  state5.4,  (5- 1) 


employee 


B.Belglm  35 


or  in  set  notation 
Insert-ind-entity(schema5.3,  state5.4, 

(5-1) 

((employee,{(name,B.Belgim)}),{(name,B.Be]gim),(age,35)})) 

where  schema5.3  represents  the  schema  of  Figure  5.3  and  state5.4  represents  the 
state  of  Figure  5.4.  Domain  names  are  left  out  for  brevity.  As  previously 
mentioned,  functions  are  denoted  in  operation  specifications  by  a set  of  ordered 
pairs. 


— 1 
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employee  operation  machine 

name^/*  ^\^afle  number^/*  N'\^type 

names  years  serial -numbers  machine-types 

Figure  5.3  - Semantic  Graph  Schema  (same  as  Figure  2.1) 


employee  < operation  < Qfr.lPCt  machine 


name 

/ 

T.Hanhart 


age 

32 


employee  < 3flPHt  operation  < CtLlPCt  machine 
narne^/^  '''X^age  number/^  ''X^type 

C .Gershag  40  JCL181  press 

employee 

/ \ 

name  / \ age 

G.Wayshum  50 

Figure  6.4  - Semantic  Graph  State  (same  as  Figure  2.2) 
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employee  operation  < ffMwt  machine 

name^/*  ^\^age  number^/^  NS\type 

T Manhart  32  MZ745  lathe 


employee <-9fl?Ht  operation « PbJBCt  machine 


name 

/ 

C .Gershag 


age 

40 


number 

/ 

JCL181 


type 

it 

press 


employee 

X \ 

name  / \ age 

G.Wayshum  50 

employee 

/ \ 

name  / \age 

B.Belgim  35 

Figure  5.5  - Semantic  Graph  State  resulting  from  Operation  (5-1) 


5.3.2  Insert-ind  association 


Figure  A 1.6  defines  the  function  lnsert-ind-a$sociation  which  is  used  to  add 
independent  associations  to  a semantic  graph  state.  It  uses  the  function 
valid-association  defined  in  Figure  A1.4  to  make  sure  that  the  association 
description  is  of  proper  form  and  that  an  association  with  identical  key 
characteristics  and  restricted  roles  is  not  already  in  the  database  state.  In 
addition  to  these  tests,  the  other  basic  requirement  is  that  all  of  the  entities 
which  participate  in  the  association  must  already  be  in  the  database  state.  If  all 
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of  these  requirements  are  met,  then  the  state  is  returned  with  appropriately 
inserted  association  nodes,  domain  nodes,  role  edges  and  characteristic  edges. 
Figure  5.6  shows  the  database  state  resulting  from 


lnsert-ind-association(schema5.1,  state5.2, 


(5-2) 


supervision 


agent 


object 

employee 

name 
B.Belgtm 

employee 


name 
G.Vayshum 


or  in  set  notation 


(5-2) 


lnsert-ind-association(schema5. 1 . states. 2, 

((supervision,  {(agent,(employee,{(name.G.Wayshum)})), 

(object.(employee,{(  name, B.Belgtm )}))},{}), 
{(agent,(employee,{(name,G.Wayshura)})) 

(object, (employee, {(name.B. Be  lgim)) ))),{))) 

where  schema5.1  is  the  schema  of  Figure  5.1  and  state5.2  Is  the  state  of  Figure 

5.2. 


Figure  5.6  - Semantic  Graph  State  resulting  from  Operations  (5-2) 
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5.3.3  Insert-semantic-unit 

The  functional  definitions  for  operations  which  insert  semantic  units  are 
more  complicated  than  those  discussed  so  far.  Figures  A 1.7,  A 1.8,  A 1.9  and 
A1.10  define  the  necessary  functions.  The  function  Insert-semantic-unit 
corresponds  to  the  function  which  would  be  visible  to  a database  system  user. 
This  function  requires  as  arguments  a schema,  a state  and  an  arg-list. 

The  arg-list  consists  of  entity  descriptions  and  association  descriptions  for 
all  of  the  components  of  the  semantic  unit.  The  way  in  which  these 
descriptions  are  ordered  in  the  list  is  not  significant  in  terms  of  the  application. 
However,  an  ordering  is  chosen  for  the  purpose  of  defining  the  operation  type 
functions. 

The  first  element  in  the  arg-list  is  an  arbitrarily  chosen  entity  in  the 
semantic  unit.  From  this  starting  point,  the  ordering  of  the  arguments 
corresponds  to  a depth  first  traversal  of  portions  of  the  schema  along  role  set 
edges  beginning  from  the  entity  set  node  corresponding  to  the  first  entity  in  the 
list.  For  each  association  set  node,  all  edges  are  traversed.  For  each  entity  set 
node  or  association  set  node  which  is  visited,  an  appropriate  entity  or  association 
description  is  added  to  the  arg-list  unless  the  description  has  been  added  earlier 
in  the  list  or  if  the  particular  entity  or  association  already  exists  in  the 
database  state. 


The  function  Insert-semantic-unit  initiates  the  mutually  recursive  calls  of  the 
functions  insert-mand-entity,  insert-mand-association  and  check-roles  which  traverse  the 
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schema  and  do  the  insertions.  The  functions  valid-entity  and  valid-association  are 
also  used  to  make  sure  that  the  descriptions  are  acceptable. 

Figure  5.7  shows  the  database  state  resulting  from 

Insert-semantic-unit(schema5.3,  state5.4,  (5-3) 


employee  operation < machine  ) 


G.Wayshum  AB982  lathe 

or  in  set  notation 

Insert-semantic-unit(schema5.3,  state5.4, 

((machine, {(number,AB982)})f{(number>AB982), (type, lathe)}), 

(5-3) 

((operation,  {(obJect,(machine,{(number,AB982)}))},{}), 

{(agent, (employee. {(name,  G.Wayshum)})), 
(object,(machine,{(number,AB982)}))},  {}))). 

In  this  case,  the  traversal  of  the  schema  (Figure  5.3)  starts  at  the  machine 
entity  set  node.  So  the  first  element  in  the  arg-list  is  the  description  of  the 
machine  entity  being  inserted.  The  first  (and  only)  mandatory  role  set  edge 
terminating  on  the  employee  entity  set  node  is  the  object  role  of  the  operation 
association.  So  this  edge  is  followed  and  the  next  element  of  the  arg-list  is 
the  association  description  which  Includes  the  previously  defined  machine  entity 
in  the  object  case.  Since  all  entities  participating  in  this  association  are  already 
either  in  the  state  or  in  the  arg-list  this  concludes  the  traversal. 


- 77  - 


employee  *agEnt  operation  machine 

name^/^  ^X^age  number^/*  ''X^ype 

T.Manhart  3 2 NZ745  lathe 


employee  <-aget^  operation  «-Pkl6Ct  machine 
name^/^  ^^X^age  number^/*  V''S\^type 

C.Gershag  40  0CL181  press 

employee  <-aflCnt  operation  < OklOCt  machine 
name^/^  ^X^age  number^/*  ''X^type 

G.Wayshum  50  AB982  lathe 

Figure  5.7  - Semantic  Graph  State  resulting  from  Operation  (5-3) 


5.3.4  Delete-ind-entity 

The  operation  type  used  to  delete  independent  entities  is  defined  by  the 
function  Delete-ind-entity  in  Figure  Al.ll.  The  basic  requirement  that  must  be 
fulfilled  by  the  entity  to  be  deleted  is  that  it  must  not  be  in  any  associations. 
This  requirement  implies  the  independence  of  the  entity. 

The  operation 


Delete-ind-entity(schema5.3,  state5.4, 


employee  ) 


name 


(5-4) 


G.Wayshum 
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or  in  set  notation 

Delete-ind-entity(schema5.3,  state5.4, 

(employee,  {(name.G.Wayshum)})), 

results  in  the  database  state  shown  in  Figure  5.8. 


(5-4) 


employee  < operation  < Pb.lOCt  machine 

name^/^  ^\^age  number^/^  '^'^type 

T.Manhart  32  NZ745  lathe 


employee  < operation  < Pb-lMt  machine 

name^/^  N\^age  number^/^  ^\^type 

C.Gershag  40  JCL181  press 


Figure  6.8  - Semantic  Graph  State  resulting  from  Operation  (5-4) 


5.3.5  Delete-ind -association 

The  operations  to  delete  independent  associations  are  defined  by  the 
function  Delete-ind-association  in  Figure  A 1.1 2.  A check,  is  made  to  make  sure 
that  this  association  is  in  fact  independent. 

If  state5.6  represents  the  graph  state  shown  in  Figure  5.6,  then 


Delete-ind-association(schema5. 1 , states. 6, 


(5-5) 


> supervision 

agent 

object 

employee  ) 

name 

B.Belglm 

employee 

1 1 1 

name 

G.Wayshum 

or  in  set  notation 

Delete-ind-association(schema5.1,  state5.6, 

(5-5) 

(supervlsian,{(agent,(employee,{(name,G.Wayshum)})), 

(object, (employee,{(name,B.Belglm)}))})) 

results  in  the  state  shown  in  Figure  5.2. 

5.3.6  Delete-semantic-unit 

To  delete  a semantic  unit,  the  function  Delete-semantic-unit  with  its 
auxiliary  functions  delete-roles  and  delete-mand-entity  are  used.  These  functions  are 
defined  in  Figures  A1.13,  A1.14  and  A1.15.  In  addition  to  the  schema  and 
state,  the  identification  of  any  arbitrarily  chosen  association  in  the  semantic 
unit  are  given  as  arguments  to  Delete-semantic-unit.  The  mutually  recursive 
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functions  then  delete  each  entity  or  association  in  the  semantic  unit  in  the 
’'reverse"  of  the  ordering  used  by  Insert-semantic-unit. 

The  following  results  in  the  graph  state  shown  in  Figure  5.4: 


Delete-semantic-unit(schema5.3,  state5.7. 


or  in  set  notation 


operation  ♦ PMftCt  machine 


AB98? 


(5-6) 


De1ete-semantic-unit(schema5.3,  state5.7, 

(operation, {(object,(machine,{(number,AB982)}))},{})). 


(5-6) 


This  completes  the  presentation  of  the  functions  which  define  the  operation 
types. 


5.4  Definitions  of  Valid  Operations  and  Valid  States 

Now  we  will  draw  the  distinction  between  operation  types  and  operations. 
The  functions  presented  in  Appendix  1 define  the  six  operation  types.  Each 
operation  type  is  a function  which  takes  as  arguments  a semantic  graph  schema, 
a semantic  graph  state  and  one  other  argument  depending  on  the  operation  type. 
However,  we  want  an  operation  to  be  a function  which  maps  one  database  state 


i 


into  another  database  state. 


The  set  of  valid  operations  will  be  necessary  for  the  study  of  equivalence 
properties  in  later  chapters  of  this  thesis.  We  will  first  show  how  such  a set 
of  operations  is  defined  for  the  Insert-ind-entity  operation  type.  For  a given 
choice  of  base  finite  sets  E,  A,  D.  R,  C and  Str,  the  set  of  possible  entity 
descriptions,  E-desc,  is  finite.  Then  define  the  set  of  possible  Independent 
entity  insertion  operations  for  a given  semantic  graph  schema  g-scheman  as  the 
set  of  functions 

operationj  (ff-stotesG-state) 
returns  G-state 

return  Insert-ind-entity (g-scheman,ff-state,e-desc  j) 
where  e-desc  j < E-desc. 

The  operations  for  the  other  operation  types  are  defined  similarly. 
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Definition  5.1;  For  a given  g-scheman  < G-schema,  the  set  of  valid  operations  is 
the  union  of  the  following  sets  for  1 < • <6: 

{operational  jf-s  tote:  G-9t  ate) 
returns  G-state 

return  Op-typej(g-scheman,0-state,argj) } 
where  for: 

i - Is  Op-typej  - lnsert-ind-entity 
arg-  c E-desc 

i - 2:  Op-typej  = Insert-ind-association 
argj  c A-desc 

i =3:  Op-typej  = Insert-semantic-unit 
arg j « Arg- list 

i - 4:  Op-typej  - Delete-ind-entity 
ar g j < E-id 

i - 5:  Op-typej  - Delete-ind-association 
argj  < A- id 

i - G:  Op-typej  - Delete-semantic-unit 
argj  < A-id 

Finally,  we  would  like  to  define  the  set  of  valid  semantic  graph  states  for 
a given  application  model.  These  aTe  those  elements  of  the  set  G-state  which 
represent  possible  states  of  the  application.  For  purposes  of  this  study,  we  will 
assume  that  the  only  allowed  initial  state  is  the  empty  state. 


I 
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Definition  5.2:  The  set  of  valid  semantic  pjraph  states  for  a given  semantic 
graph  schema  g-schema  and  set  of  valid  operations  G-operat  ions,  where 
G-operations  is  defined  for  g-schema  as  in  Definition  5.1,  is  defined  as  follows: 

1.  The  empty  state  * (0,0,  0,0,0)  is  a valid  semantic  graph  state.  j 

2.  If  g-state  is  a valid  state,  g-op  ( G-operations  and  g-op(g-state)  * 

error, 

then  g-state’  - g-op  (g-state)  is  a valid  semantic  graph  state. 
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Chapter  6 - The  Semantic  Belation  Data  Model 

As  previously  stated,  the  semantic  relation  data  model  Is  based  on  the 
concept  that  the  user  sees  the  database  state  as  sets  of  statements  about  the 
application  state.  In  describing  the  semantic  relation  data  model  we  will  make 
the  assumption  that  the  application  and  application  state  are  formally  specified 
by  the  corresponding  semantic  graph  application  model  and  semantic  graph 
database  state  respectively. 

6.1  Semantic  Relation  Statements 

The  meaning  of  a statement  (i.e.  row  or  tuple)  in  a relation  can  be  given 
by  a logical  statement  about  the  application  state.  Several  examples  will  be 
given  which  will  both  clarify  this  idea  and  show  the  "expressive  power"  of  the 
semantic  relation  data  model.  The  user  of  a semantic  relation  database  system 
would  not  deal  explicitly  with  the  logical  statements  as  will  be  presented  here. 
We  will  use  "row”,  "statement"  and  "tuple"  lnterchangably. 

For  the  examples  to  follow  we  will  consider  Figures  6.1,  6.2,  6.4  and  6.5 
which  are  identical,  with  the  exception  of  some  italicized  words,  to  Figures  2.3, 
2.4,  2.6  and  2.7.  The  significance  of  the  italicized  words  in  the  relations  will 


(■ 


PRBCEDXNO  PAGX  BUMC-NOT  KUttD 


be  explained  shortly. 
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Using  the  notation  which  defines  the  semantic  graph  data  model  as  in 
Chapter  5 and  Appendix  1,  we  will  associate  with  the  first  row  of  the  relation 
in  Figure  6.1  the  following  logical  statement: 


Ga-id:A-id)  Ge-td-isE-id)  (3e-td-2tE-id) 

1 a-id  < state. A 
a e-id-1  c state. E 
a e-id-2  ( state. E 
a o-td. type  « operation 
a e-id-1. type  - employee 
a e-id-2. type  - machine 
a state. R(a-id, agent)  • e-id-1 
a state. R(a-id, ob ject)  - e-id-2 
a state. Cie-id-1, name)  ■ (names, T.ttanhart) 
a state. Cie-id-2, number)  » (serial -number s,NZ7A5) 
a state. C(e-id-2, type)  - (machine-types, lathe)  ). 


operate: agent 

operate  -.object 

employee 

machine 

name 

number 

type 

names 

serial -numbers 

machine-types 

T.Nanhart 

NZ745 

lathe 

C.Gershag 

JCL181 

press 

Figure  6.1  - Semantic  Relation 


This  asserts  that  there  exist  in  the  application  state  an  association  and  two 


entities  with  the  specified  characteristics. 


Note  that  a tuple  concerning  the  "operate"  predicate  was  translated  into  a 
logical  statement  concerning  an  "operation"  association.  This  points  out  that  the 
relation  rows  are  to  be  thought  of  as  natural  language  statements  given  in  a 
case  grammar  form.  In  fact,  we  will  assume  that  the  set  of  possible  cases  is 
Identical  to  the  set  of  possible  roles.  For  each  association  we  will  assume  there 
is  a unique  corresponding  predicate.  However,  there  is  one  group  of  predicates 
which  do  not  correspond  to  associations. 

For  each  type  of  entity  in  the  application,  there  is  a predicate  "be  type " 
where  type  is  the  entity  set  name.  An  example  of  a relation  using  such  a 
predicate  is  in  Figure  6.2.  This  type  of  predicate  is  used  to  state  the  existence 
of  an  entity  as  opposed  to  the  existence  of  an  association.  With  the  first  row 
in  Figure  6.2  we  can  associate  the  logical  statement 

(3e-td:E- i d) 

( e-id  c state. £ 
a e-id. type  - employee 
a state. Cle-id, name)  - (names, T.Hanhart) 
a state. C(e-id, age)  » (years, 32)). 

Note  that  the  above  logical  statement  asserts  the  existence  of  an  employee 
named  "T.Manhart”,  but  it  does  not  state  that  there  is  only  one  such  employee. 
This  Information  would  be  derived  from  a relational  constraint  stating  that  the 
relation  in  Figure  6.2  is  a function  from  the  values  in  the  first  column  to  the 
values  in  the  second  column.  That  is,  the  names  in  the  first  column  are 
unique.  With  such  a constraint,  using  "31”  to  mean  "there  exists  exactly  one", 


we  would  state 


be  employee  -.object 


employee 

name 

age 

names 

years 

T.Manhart 

32 

C.Gershag 

40 

G. Way shorn 

50 

Figure  6.2  - Semantic  Relation 


(31  e-id: E-id) 

( e-id  c state. E 
a e-id.  type  - employee 
a state. C(e-id, name)  - (names, T.Manhart) 
a’ state. C(e-ttf, age)  * (years,32)). 


This  states  that  there  Is  a unique  employee  named  "T.Manhart"  and  of  age  "32". 
We  might  also  add  to  the  logical  statement  that  "name"  Is  the  identifying  (hey) 
characteristic. 

The  Information  necessary  to  Know  that  a relational  statement  concerns  • 
unique  entity  or  association  is  contained  In  the  constraints.  However,  it  Is  such 
a useful  piece  of  Information  when  interpreting  the  meaning  of  a relation,  that 
we  will  Introduce  a "sugaring”  to  express  the  information  In  the  relation  itself. 


This  Is  the  purpose  of  the  ltallcization  In  Figures  6.1  and  6.2. 

For  explanatory  purposes  we  need  the  concept  of  a column-group.  By 
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column-group  we  will  mean  the  subset  of  a relation's  columns  which  contains 
all  of  the  characteristics  described  in  the  relation  for  a given  predicate  or  set  of 
predicate:case  pairs.  In  Figure  6.1,  the  employee  "name”  column  forms  one 
column-group.  The  machine  " number " and  "type"  columns  form  the  second 
column-group  of  this  relation. 

If  all  of  the  key  characteristics  of  an  entity  type  are  present  in  a single 
column-group,  then  the  key  characteristic  names  in  the  relational  heading  will 
be  italicized.  This  is  the  reason  for  the  italicized  "name"  and  "number"  in 
Figure  6.1.  If  some,  but  not  all  of  the  key  characteristics  of  an  entity  type  are 
present,  none  of  the  characteristic  names  for  the  column-group  will  be 
italicized.  This  will  clearly  delineate  those  cases  in  which  unique  entities  are 
referenced. 

A similar  scheme  will  be  adopted  for  the  predicates  in  the  relations.  If, 
for  a given  predicate  in  a relation,  all  of  the  key  characteristics  are  present  and 
all  of  the  key  characteristics  for  each  restricted  case  (role)  are  present,  then 
the  key  characteristic  names  and  restricted  case  names  will  be  italicized.  This 
accounts  for  the  Italicized  " object " in  Figure  6.1.  If  any  of  the  identifying 
characteristics  are  missing,  then  none  of  the  relevant  case  names  or 
characteristic  names  will  be  italicized. 

This  italidzatlon  is  not  directly  reflected  in  the  formal  specification  of  the 
relation  headings.  Such  information  is  specified  by  the  constraints.  The  reason 
for  expressing  this  information  in  terms  of  the  constraints  Is  to  maintain  greater 
generality  in  the  semantic  relation  data  model. 


While  the  italicization  scheme  is  perfectly  acceptable  for  describing 
applications  as  specified  by  semantic  graph  application  models,  we  can  imagine 
features  of  applications  which,  while  not  expressible  in  the  semantic  graph 
model,  would  be  expressible  in  the  semantic  relation  model  with  appropriate 
types  of  constraints.  In  terms  of  the  problem  of  uniquely  identifying  entitles 
and  associations,  we  might,  for  example,  desire  multiple  identifiers  for  a single 
entity  or  association.  These  problems  of  comparative  expressiveness  of  the  date 
models  will  be  explored  further  in  succeeding  chapters  of  this  thesis. 

It  is  very  important  to  realize  that  the  use  of  Italics  In  this  thesis  Is  not 
Identical  to  the  notion  of  relational  keys  in  [Codd  72a].  In  particular,  the  set 
of  all  italicized  characteristics  do  not  necessarily  comprise  a relational  key.  For 
example.  Figure  6.3  shows  a relation  describing  a state  in  which  a unique 
employee  name  "J.Doe"  operates  two  types  of  machines.  The  employee  " name " 
column  does  not  form  a key  to  this  relation.  Also  note  that  Figure  6.3  does  not 
rule  out  the  possibility  that  "J.Doe"  operates  several  machines  of  type  "drill". 

The  relational  key  of  Codd  provides  a means  of  uniquely  identifying 
relational  tuples  in  the  database  state.  Italicization  <and  constraints)  in  the 
semantic  relation  data  model  provides  a means  of  uniquely  identifying  entities 
and  associations  in  the  application  state.  At  times  the  goals  of  relational  keys 
and  italicization  coincide.  In  fact,  many  of  the  benefits  of  relational 
normalization  theory  [Codd  72a]  result  from  structuring  relations  so  that  single 
tuples  correspond  to  single  objects  in  the  application  state. 

The  use  of  relations  to  express  a statement  concerning  a single  association 
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operate: agent 

operate: object 

employee 

machine 

name 

age 

type 

names 

years 

machine-types 

J.Doe 

25 

drill 

J.Ooe 

25 

lathe 

Figure  6.3  - Italiclzatioa  Not  the  Same  as  Relational  Keys 


is  the  basic  type  of  semantics  associated  with  the  use  of  relations  in  [Chen]  and 
[Hall  et.  al.].  However,  the  model  presented  here  is  unique  In  its  ability  to 
express  more  complex  statements.  For  example.,  the  first  row  of  the  relation  in 
Figure  6.4  is  associated'  with  the  following  logical  statement: 
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<3a-id:A-id)  (3e-id-I:E-id)  <3e-td-2:E-ld> 

( a-id  c state. A 
a e-id-1  c state. £ 
a e-id-2  t state. E 

a a-id. type  - operation 
a e-id-1. type  - employee 
a e-id-2,  type  - machine 
a state. Ria-id, agent)  - e-id-1 
a state. Rla-id, object)  - e-id-2 
a state. C(e-td-l, name)  » (names. T.Manhart) 
a stole. C(e-id-2, number)  - (ser ial -number s,NZ745)  ) 
A (3a-td:  A- id)  (3«-id-lsE-id)  (3e-id-2iE-id) 

( a-id  c state. A 
a e-id-1  t state. E 
a e-id-2  c state. E 
a a-id. type  - supervision 
a e-id-l.type  » employee 
a e-id-2,  type  - employee 
a state. R(a-id, agent)  - e-id-1 
a state. Rla-id. object)  - e-id-2 
a state. Cle-id-1, name)  - (names, G.Uay ahum) 
a state. Cle-id-2, name)  - (names, T.Manhart)  ) 


Obviously,  such  logical  statements  become  extremely  long  for  larger  relations. 
One  attractiveness  of  the  tabular  form  of  such  relations  is  the  compactness  of 
expression  of  such  logical  statements. 

One  feature  of  the  preceding  logical  statement  is  that  it  does  not  claim 
that  the  employee  named  "T.Manhart"  in  the  supervision  association  is  the  same 
employee  as  the  one  named  T.Manhart"  in  the  operation  association.  As  in  the 
previous  case  involving  the  problem  of  uniqueness,  the  existence  of  constraints 
on  the  relations  would  allow  us  to  rewrite  the  previous  logical  statement  to 
show  that  the  two  employees  are,  in  fact,  identical. 

Another  unique  feature  of  the  relational  data  model  presented  here  is  the 
use  of  the  null  value  ("•-")  in  the  relations.  An  example  of  the  logical 
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interpretation  of  a relation  row  using  this  value  is  the  following  logical 
statement  corresponding  to  the  last  row  in  the  relation  of  Figure  6.5 

(31  e-idsE-id) 


( e-id  c state. E 

e-id. type  - employee 
state. C(e-id, name)  - (names, G.Uayshum) 
state. Cle-id, age)  - (years, 50) 

-•  (3e-td:A-id) 

( a-id  ( state./) 
a a-id. type  - operate 
a state. Rla-id, agent) 


e-id)  ). 


Figure  6.5  - Semantic  Relation  with  Null  Value 


- 95  - 


6.2  The  Semantic  Relation  Application  Model  Schema 


The  relational  schema  describes  the  headings  of  the  relations  as  they  have 
been  shown  in  the  figures  and  also  states  the  constraints  on  the  statements 
allowed  in  the  relations.  In  set  theoretic  terms,  a semantic  relation  schema  is 


(Rel,  Con,  R-names,  P,  S,  E,  D,  C,  sd,  Str) 


where  the  components  are  defined  as  follows: 


Rel:  This  is  a function 

ReliR-names  -♦  Heading 

which  maps  relation  names  into  a heading.  A heading  is  a function  of  the  type 
Heading:  ( (Power  set’ (P  x S)  x E)  UP)  -*  [C  -*  D1 
where  Pouerset’  is  the  set  of  all  subsets  except  the  empty  set.  E is  the  set  of 
entity  set  names  as  used  in  the  semantic  graph  model.  S is  the  set  of  possible 
semantic  case  names  which  we  will  assume  identical  to  the  set  R of  role  names 
in  the  semantic  graph  model.  C is  the  set  of  characteristic  names  and  D is  the 
set  of  domain  names  as  in  the  semantic  graph  model.  P is  the  set  of  predicates 
which  contains  a predicate  for  each  association  type  in  the  graph  model  plus  a 
"be"  predicate  for  each  element  of  E. 

The  heading  function  specifies  the  "levels”  of  information  shown  in  the 
relational  headings.  The  top  level  is  a set  of  predicate:case  pairs  or  else  a single 
predicate.  Then  the  entity  type  is  specified  for  the  predicate:case  pairs.  Next 
the  characteristics  being  given  are  stated.  Each  characteristic  corresponds  to  a 
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single  column  In  the  relation's  rows.  Finally  the  domain  of  the  characteristic  is 
specified.  It  Is  also  required  that  no  predlcatetcase  pair  may  appear  In  two 
different  predlcate:case  sets  which  are  in  the  domain  of  a relation's  heading 
function.  That  is,  a predlcate:case  pair  may  appear  only  once  in  the  heading. 

Con:  This  is  the  set  of  constraints.  Each  constraint  maps  the  database 
state  into  a Boolean  value  stating  whether  the  state  satisfies  the  constraint. 
Further  description  of  constraints  will  be  deferred  to  Chapter  9.  This  is  done 
so  that  the  constraints  can  be  introduced  In  a coherent  manner  based  on  the 
work,  presented  in  Chapter  8. 


sd:  This  function  gives  domain  specifications  and  is  identical  to  the  ed 
used  in  the  semantic  graph  schema  with  the  exception  that  in  the  relational 
model  the  null  value,  " — ",  is  defined  to  be  a valid  element  of  every  domain. 
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6.3  The  Semantic  Relation  Database  State 

One  of  the  prime  advantages  claimed  for  existing  "relational-type"  data 
models  is  the  conceptual  simplicity  of  a user  seeing  the  database  state  in  tabular 
form.  This  same  feature  can  be  seen  in  the  relative  simplicity  of  the  definition 
of  the  semantic  relation  database  state. 

The  state  is  a function  from  each  relation  name  into  a set  of  tuples  (rows) 
which  correspond  to  the  statements  expressed  by  the  relation.  Formally,  the 
state  is  a function 

R-namea  -♦  Power  set  (Tuples) 

where  Tuples  is  the  set  of  all  possible  tuples  and  is  defined  as 

[( (Power set*  (P  x S)  x E)  1)  P)  -*  IC  -*  Domain-element! ) 
where  Domain-element  is  defined  as  in  the  graph  model  except  for  the  addition 
of  the  null  value. 

The  above  definition  simply  says  that  in  each  row  there  is  a domain 
element  in  each  column.  The  Insert-tuples  operation  (to  be  defined  later)  must 
guarantee  that  this  domain  element  is  in  the  proper  domain  for  the  column  as 
specified  by  the  schema.  Note  that,  formally,  the  relation  heading  is  present  in 
each  tuple. 

One  other  restriction  which  must  be  enforced  by  the  operations  is  that 
there  is  no  row  consisting  completely  of  null  values.  In  the  case  where  there 
are  no  associations  or  entities  in  the  application  state  which  is  described  by  the 
relation,  we  will  instead  let  the  relation's  state  be  the  empty  set. 


Figure  6.6  shows  one  possible  set  or  relational  headings  and  the  relational 
state  corresponding  to  the  semantic  schema  and  state  shown  In  Figures  5.1  and 
5.2.  The  constraints  for  this  example  will  be  presented  in  Chapter  9. 
Remember  that  in  this  example  both  a number  and  a type  are  necessary  to 
Identify  a machine  while  both  machine  and  shift  are  necessary  to  Identify  an 
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Machine-age 


Locat lons-used 
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50 

... 

••• 

Figure  6.6  - Semantic  Relations 
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6.4  The  Semantic  Relation  Operation  Types 

There  are  only  two  operation  types  In  the  semantic  relation  data  models 
Insert-tuples  and  Delete-tuples.  These  operation  types  either  add  sets  of  statements 
or  delete  sets  of  statements  from  relations.  A single  operation  may  Insert  or 
delete  any  finite  number  of  tuples  into  any  number  of  relations  In  the  database 
state.  Even  though  we  speak  of  "inserting"  and  "deleting"  tuples,  it  is 
Important  to  remember  that  the  operations  are  defined  from  a purely  functional 
view. 

The  notation  used  to  describe  the  relational  operations  is  the  same  as  that 
used  in  the  definition  of  the  semantic  graph  operations.  Formal  specifications 
are  given  in  Appendix  2.  The  relevant  type  domains  are  defined  in  Figure  A2.1 
of  Appendix  2.  The  predefined  functions  and  special  notation  are  given  in 
Figure  A2.2.  The  operations  are  defined  in  Figures  A2.3  - A2.8.  It  can  be  seen 
that  the  operations  are  quite  a bit  less  complex  than  the  semantic  graph 
operations.  The  ordering  definitions  in  Figure  A2.3  will  be  explained  later  in 
this  chapter. 

6.4.1  Insert-tuples 

The  predicate  function  valid-rels  defined  in  Figure  A2.4  makes  sure  that  the 
proposed  tuples  to  be  added  or  deleted  are  of  a proper  form  as  specified  by  the 
schema.  This  Includes  checking  that  no  tuple  consists  of  ell  null  values. 

The  operation  type  Insert-tuples  defined  in  Figure  A2.5  maps  a schema,  a 
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state  and  the  tuples  to  be  added  into  a new  relational  state  or  the  error  value. 
An  error  results  If  the  tuples  to  be  added  are  not  of  the  proper  form.  If  any  of 
the  tuples  to  be  added  already  exist  in  the  state,  or  if  any  of  the  constraints 
evaluate  to  false  when  applied  to  the  relations  consisting  of  the  unions  of  the 
previous  state's  relational  tuples  and  the  tuples  to  be  added.  Rather  than  simple 
set  union,  the  function  s-union  ("statement  union")  defined  in  Figure  A2.6  is 
used.  The  way  in  which  s-union  differs  from  ordinary  set  union  will  be 
explained  shortly. 

Figure  6.7  shows  the  relations  of  Figure  2.5.  The  relational  state  shown 
in  Figure  6.8  results  from 
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Insert-tup1es(schema6.7,  state6.7, 
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or  in  set  notation 
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Employees 
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Figure  6.7  - Complete  Semantic  Relation  State 
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be  employee: object 
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Figure  6.8  - Semantic  Relations  resulting  from  Operation  (6-1) 
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Insert-tuples(schema6.7,  state6.7, 

{(Operate, {(({(operate, agent)}, employee), {(name, G.Wayshum)}), 
(( {(be  machine.object), (operate, obJect)},machlne), 
{(number,AB982),  (type,  lathe)})})}) 


(6-1) 


where  schema6.7  Is  the  relational  schema  and  state6.7  is  the  relational  state  of 
Figure  6.7.  Figure  5.7  does  not  fully  define  the  schema  since  the  constraints 
are  not  presented  there.  We  will  present  the  constraints  later  and  assume  that 
none  of  the  operations  shown  here  violate  those  constraints. 

This  operation  Inserts  a single  tuple  Into  the  Operate  relation.  Since  the 
resulting  state  Is  "state  equivalent"  (to  be  defined  later)  to  the  graph  state  of 
Figure  5.7,  operation  6-1  is  "operation  equivalent"  (also  to  be  defined  later)  to 
operation  5-3.  Note  that  while  we  use  a colon  in  the  text  and  figures  to 
separate  predicates  and  cases  in  predlcate:case  pairs,  when  specifying  operations 
we  will  denote  predicate:case  pairs  as  ordered  pairs,  e.g  (operate, agent).  Also 
note  that  the  formal  specification  in  Appendix  Z for  Insert-tuples  (and 
Delete-tuples)  requires  the  "insertion"  (deletion)  of  the  empty  set  of  statements 
into  each  relation  for  which  no  change  is  desired.  We  will  not  explicitly  show 
this  In  either  of  our  operation  notations. 

The  relational  state  shown  In  Figure  6.9  is  the  result  of 


Insert-tuples(schema6.7,  state6.7, 


Employees 


be  employee: object 

employee 

name 

age 

names 

years 

B.Belglm 

35 

or  in  set  notation 
Insert-tuples(schema6.7,  state6.7, 

{(Employees,{(({(be  employee, object)},employee), 
{(name,B.Belglm),(age,35)})})}). 


(6-2) 


(6-2) 


This  Corresponds  to  operation  5-1  and  the  resulting  graph  state  shown  in  Figure 
5.5. 

There  are  several  observations  which  should  be  made  concerning  these 
examples  showing  operations  which  Insert  tuples.  Firstly,  even  though  one 
operation  above  corresponds  to  the  insertion  of  a semantic  unit  in  the  graph 
model  (6-1)  and  the  other  one  corresponds  to  the  Insertion  of  an  Independent 
entity  (6-2),  the  same  operation  type  is  used  in  the  relational  model.  Secondly, 
even  though  a single  tuple  corresponds  to  a semantic  unit  in  the  Operate 
relation  and  to  an  independent  entity  in  the  Employees  relation  of  Figures  6.1 
and  6.9,  it  may  be  the  case  that  inserting  a semantic  unit,  an  Independent 


operate: agent 


be  machine  -.object 
operate: object 


Figure  6.9  - Semantic  Relations  resulting  from  Operation  (6-2) 


107  - 


entity  or  an  independent  association  requires  the  addition  of  several  tuples. 

Finally,  it  should  be  noted  that  a single  Insert-tuples  operation  could  do  the 
equivalent  work  of  several  semantic  graph  operations.  For  example,  Figure  6.10 
shows  the  result  of 


Insert-tuples(schema6.7,  state6.7. 

Employees 


be  employee: object 

employee 

name 

age 

names 

years 

B.Belglm 

35 

(6-3) 


Operate 


operate: agent 

be  machine  -.object 
operate  -.object 

employee 

machine 

name 

number 

type 

names 

serial -numbers 

machine-types 

G.Wayshum 

AB982 

lathe 

or  in  set  notation 


■MW 


108  - 


be  employee:obJect 

employee 

name 

age 

names 

years 

T.Hanhart 

32 

C .Gershag 

40 

G. Way shorn 

50 

B.Belgim 

35 

Operate 


operate: agent 

be  machine  -.object 
operate  -.object 

emp 1 oyee 

machine 

name 

number 

type 

names 

serial-numbers 

machine -types 

T.Manhart 

NZ745 

lathe 

C.  Gershag 

JCL181 

press 

G.Wayshum 

AB982 

lathe 

Figure  6.10  - Semantic  Relations  resulting  from  Operation  (6-3) 
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Insert-t‘jples(schema6.7,  state6.7, 

{ ( Employees, { ( ( { ( be  employee, object )}  .employee), 
{(name,B.Belgim),(age,35)})}), 

(6-3) 

(Operate,{(({(operate, agent)}, employee), {(name.G.Wayshum)}), 

(({(be  machine, object),(operate,obJect)},machlne), 
{(number.AB982),(type, lathe)})})}). 


This  is  equivalent  to  the  composition  of  the  Insert-semantic-unit  and  Insert-ind-entity 
graph  operations,  5-1  and  5-3  which  created  Figures  5.5  and  5.7. 

The  following  example  will  show  how  the  s-union  function  differs  from 
ordinary  set  union.  This  difference  is  only  visible  when  tuples  with  null 
values  ("-—")  are  in  the  relations.  If  this  is  the  case,  then  it  might  be  desired 
to  add  to  the  relation  a statement  which  is  identical  to  some  existing  statement 
in  the  relation  with  the  exception  that  some  of  the  values  which  were  null  in 
the  existing  statement  are  non-null  in  the  added  statement.  When  this  is  the 
case,  the  existing  tuple  is  replaced  by  the  added  tuple.  That  is,  the  existing 
tuple  is  automatically  deleted. 

Semantically,  this  means  that  a statement  of  the  form  "there  is  no 
entity( association)  such  that  . . is  automatically  deleted  when  the  statement  lx 
made  false.  For  example,  Figure  6.11  is  the  result  of 
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Insert-*upl«(schema6.6,  state6.6, 

Superv 1 se -opera te 


(6-4) 


supervise: agent 

be  employee: object 
supervise: object 
operate :agent 

operate: object 

employee 

employee 

machine 

name 

name 

age 

number 

type 

names 

names 

years 

serial -numbers 

machine- types 

G.Wayshum 

B.Belglm 

35 

NZ745 

lathe 

or  In  set  notation 

Insert-tuples(schema6.6,  state6.6, 

{(Supervise-operate,{(({(supervise,agent)},employee),{(name,G.Wayshum)}), 
(({(be  employee, obJect),(supervlse,obJect),(operate,agent)}, 

(6-4) 

employee), {(name,B.Belglm),(age, 35)}), 
(({(operate, object)}, machine),  {(number  ,NZ745),(type,lathe)})))}). 

This  operation  Is  equivalent  to  the  addition  of  the  Independent  association  in 
operation  5-2  which  resulted  in  Figure  5.6. 

This  type  of  operation  might  be  better  understood,  and  is  formally  defined, 
in  terms  of  a partial  ordering  of  statements.  This  ordering  is  used  in  the 
s-union  and  s-di  (Terence  functions.  The  ordering  says  that  a statement  asserting 
the  existence  of  an  entity  or  association  takes  precedence  over  ("is  greater 
than",  >)  a statement  asserting  that  no  such  entity  or  association  exists.  So 
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machine 
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Figure  6.11  - Semantic  Relations  resulting  from  Operation  (6-4) 
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when  a statement  asserting  existence  is  added  to  a relation,  the  statement 
asserting  the  non-existence  is  automatically  deleted. 

The  ordering  of  statements  is  formally  defined  In  Figure  A2.3.  This 
ordering  is  based  on  the  definition  that  within  a given  domain,  all  non-null 
values  are  greater  than  the  null  value.  Furthermore,  non-null  values  are 
incomparable  to  any  value  other  than  the  null  value  and  itself  (to  which  it  la 
equal). 

Extending  this  to  tuples  with  equal  headings,  we  define,  for  example,  that 
the  tuple 


supervise :agent 

supervise  -.object 
operate: agent 

operate: object 

employee 

employee 

machine 

name 

name 

number 

names 

names 

serial-numbers 

G.Wayshum 

B.Belgia 

NZ745 

is  greater  than 


itlNM 


113  - 


supervise ‘.agent 

supervise: object 
operate :agent 

operate: object 

employee 

employee 

machine 

name 

name 

number 

names 

names 

serial -numbers 

— 

B.Belglm 

NZ745 

Tuple  6-5  is  Incomparable  with 


supervise : agent 

supervise  -.object 
operate :agent 

operate  -.object 

employee 

employee 

machine 

name 

name 

number 

names 

names 

serial -numbers 

G.Vayshum 

C.Gershag 

JCL181 

By  this  definition,  the  first  row  of  the  relation  Supervlse-operate  in  Figure 
6.11,  Inserted  by  operation  6-4,  takes  precedence  over  the  first  row  of 
Supervlse-operate  in  Figure  6.6. 

It  should  also  be  noted  that  whenever  a single  characteristic  in  a tuple  Is 
null,  every  other  characteristic  describing  the  same  entity  or  predicate  must  be 
null.  This  is  in  accordance  with  the  interpretation  of  the  null  value.  The 
constraints  associated  with  the  database  schema  must  assure  this  property.  The 


function  valid-rels  also  checks  that  no  two  tuples  in  a single  relation  are 
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comparable  under  this  ordering. 

6.4.E  Delete-tuples 

For  use  with  the  Delete-tuples  operation,  the  ordering  of  statements  allows 
the  comparison  of  tuples  in  which  the  predicates  in  one  tuple  are  a subset  of 
the  predicates  in  another  tuple.  This  involves  the  ordering  a«f«i><iinn  on  tuple 
headings  in  Figure  A2.3.  In  particular, 


operate: agent 

be  machine  ‘.object 
operate '.object 

employee 

machine 

name 

number 

type 

names 

serial -numbers 

machine- types 

G.Wayshum 

AB982 

lathe 

is  greater  than 
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(6-9) 


Note  that 


(6-10) 


Is  Incomparable  to  tuple  6-9.  While  the  heading  In  6-9  Is  less  than  the 
heading  in  6-10,  the  corresponding  values  in  6-9  are  not  all  less  than  or  equal 
to  those  in  6-10  (since  "G.Wayshum"  is  greater  than  " — "). 

The  operation  type  Delete-tuples  defined  in  Figure  A2.7  maps  a schema,  a 
state  and  the  tuples  to  be  deleted  into  a new  relational  state  or  the  error  value. 


Each  tuple  specified  in  the  deletion  must  be  less  than  or  equal  to  some  tuple  In 
the  database  state.  An  error  results  if  any  of  the  constraints  evaluate  to  falee 
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when  applied  to  the  relations  consisting  of  the  difference  between  the  previous 
state's  relational  tuples  and  the  tuples  to  be  deleted.  For  reasons  similar  to 
those  resulting  in  the  use  of  s-union  Instead  of  set  union,  the  function 
s-difference  defined  in  Figure  A2.8  is  used  by  Delete-tuples  Instead  of  regular  set 
difference.  The  first  two  examples  specify  the  deletion  of  tuples  equal  to 
tuples  in  the  relations  of  concern.  In  this  case,  s-difference  is  identical  to 
regular  set  difference. 

The  operation 


Delete-tuples(schema6.8,  stated. 8, 


Operate 


operate -.agent 

be  machine '.object 
operate  '.object 

employee 

machine 

name 

number 

type 

names 

serial -numbers 

machine- types 

G.Wayshum 

AB982 

lathe 

(6-11) 


) 


or  in  set  notation 


■MS* 
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Delete-tuples(schema6.8,  state6.8, 

{(Operate,{(({(operate, agent)}, employee), {(name.G.Wayshum)}), 
(({(be  machine, object), (operate.object)}  .machine), 
{(number,AB982),(type,lathe)})})}) 


(6-11) 


results  In  the  relational  state  shown  In  Figure  6.7.  This  Is  equivalent  to  the 
deletion  of  a semantic  unit  In  operation  5-6. 

Figure  6.7  Is  also  the  result  of 


Delete-tuples(schema6.9,  state6.9, 


Employees 


be  employee: object 

employee 

name 

age 

names 

years 

B.Belglm 

35 

or  In  set  notation 

Delete-tuples(schema6.9,  state6.9, 

((Employees, {(({(be  employee,obJect)},employee), 
{(name,B.Belglm),(age,35)})})}). 


(6-12) 


(6-12) 


The  next  example  presents  a situation  In  which  the  heading  In  a tuple 
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specified  for  deletion  causes  the  tuple  to  be  "less  than"  a tuple  in  the  database 
state.  For  the  operation  to  be  legal,  we  must  assume  that  the  constraints 
associated  with  Figure  6.8  do  not  require  that  every  machine  have  an  operator, 
"operation"  can  then  be  considered  to  be  an  independent  association  rather  than 
part  of  a semantic  unit. 

Figure  6.12  is  the  result  of 


Delete-tuples(schema6.8,  state6.8. 


Operate 


operate :agent 

operate  -.object 

employee 

machine 

name 

number 

type 

names 

serial -numbers 

machine-types 

G.Wayshum 

AB982 

lathe 

(6-13) 


) 


or  in  set  notation 


De1ete-tuples(schema6.8,  state6.8, 

{(Operate,((({(operate,agent)),employee),((name,G.Wayshum)}), 
(({(operate, object)), machine), 

{(number,AB982),(type,lathe)})})}). 


(6-13) 


In  this  operation  the  tuple  specified  for  deletion  Is  less  than  the  third 
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name 

age 

names 

years 

T.Manhart 

32 
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G.Wayshum 
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Operate 
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employee 

machine 
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Figure  6.12  - Semantic  Relations  resulting  from  Operation  (6-13) 
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tuple  of  Operate  In  Figure  6.8.  While  the  values  In  the  tuples  are  equal,  the 
predicates  in  the  heading  of  the  tuple  in  the  operation  are  a subset  of  the 
predicates  in  the  relation. 

The  meaning  of  operation  6-13  is  that  the  "operation"  association  between 
G.Wayshum  and  machine  AB982  is  to  be  deleted,  but  the  existence  of  machine 
AB982  is  to  remain  a part  of  the  database  state.  This  should  be  contrasted  with 
operation  6-11  in  which  the  machine  and  association  were  deleted  together. 

The  mechanism  of  such  deletions  is  to  reduce  the  precedence  of  tuples  in 
the  database  state  until  they  become  less  than  or  Incomparable  to  the  tuples 
specified  in  the  deletion.  That  is.  each  assertion  in  the  tuples  specified  for 
deletion  must  be  made  false. 

This  is  done  by  the  introduction  of  null  values.  For  a tuple  in  the 
database  state  which  is  greater  than  a specified  deletion,  null  values  are 
Introduced  for  each  characteristic  which  is  equal  in  value  to  that  in  the 
deletion  and  which  describes  precisely  the  same  predicate  or  set  of  predicate:caae 
pairs. 

Consider  this  rule  applied  to  operation  6-13  on  a column  by  column  basis. 
The  tuple  in  the  deletion  is  less  than  the  third  tuple  in  the  Operate  relation  of 
Figure  6.8.  The  first  columns  have  equal  values,  "G.Wayshum",  and  describe 
the  same  predicate: case  pair,  "operate:agent".  So  for  the  resulting  tuple,  this 
column  will  have  the  null  value. 

The  second  and  third  columns  have  equal  values,  but  do  not  describe  the 
Identical  predicate:case  pairs.  The  tuple  in  the  deletion  describes  "operatetoh/ecf" 
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while  the  tuple  in  the  database  state  describes  both  "operat e:object”  and 
"be  machinerobject".  So  these  two  values  will  not  be  changed.  The  resulting 
tuple  is  the  third  tuple  of  Operate  in  Figure  6.12.  As  shown  in  6-9  and  6-10, 
this  tuple  is  incomparable  to  the  tuple  in  the  delete  operation. 

Figure  6.6  is  the  result  of 


Delete-tup1es(schema6. 1 1 , state6. 1 1 , 


(6-14) 


Supervlse-operate 


supervlseragent 

supervise  -.object 
operate :agent 

operate -.object 

emp 1 oyee 

employee 

machine 

name 

name 

age 

number 

type 

names 

names 

years 

serial-numbers 

machine-types 

6 . Wayshum 

B.Belglm 

35 

— 

— 

or  in  set  notation 


Delete-tuples(schema6. 1 1 , state6. 1 1 , 

{(Supervtse-operate,{(({(supervlse,agent)},employee),{(name,G.Wayshum)}), 

(6-14) 


(({(supervlse,object),(operate,agent)},einployee), 

{(name,B.Belgim),(age,35))), 

(({(operate,object)},machine),{(number,M— "),(typet" — ")})})}). 


This  operation,  which  is  equivalent  to  the  deletion  of  an  Independent  association 


In  operation  5-5,  makes  use  of  null  values.  As  In  the  previous  example,  the 
result  of  the  s-difference  function  is  not  the  same  as  set  difference  because  the 
tuple  in  operation  6-13  is  "less  than"  the  first  row  of  Supervlse-operate  In 
Figure  6.11. 

6.5  Definitions  of  Valid  Operations  and  Valid  States 

As  in  the  case  of  the  semantic  graph  data  model,  we  want  to  define  the 
set  of  operations  for  a given  schema  which  map  database  states  into  database 
states. 

Definition  6.1:  For  a given  r-scheman  c R-scheaa,  the  set  of  valid  operations  is 
the  union  of  the  following  sets  for  1 < i <2: 

{operation  jj  ( r-s  tate:  R-s  tatel 
returns  R-state 

return  Op-typej  (r-ache»an, r-s tate.arg  j I } 

where  argj  c R-atate 
and  for: 

• - Is  Op-typej  - Insert-tuples 
i - 2:  Op-typej  - Delete-tuples 

The  set  of  valid  semantic  relation  database  states  correspond  to  those 
elements  of  the  set  R-atate  which  represent  possible  states  of  the  application. 
As  in  the  case  of  the  semantic  graph  data  model,  we  will  require  that  the 
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initial  state  for  all  semantic  relation  applications  be  the  empty  state. 

Definition  6.2;  The  set  of  valid  semantic  relation  states  for  a given  semantic 
relation  schema  r-schema  and  set  of  operations  R-operat ions,  where  R-operations 
is  defined  for  r-schema  as  in  Definition  6.1,  is  defined  as  follows: 

1.  The  empty  state  r-state  which  maps  each  relation  name  defined  in 
r-schema  into  the  empty  set  is  a valid  semantic  relation  state  if  for  all 
constraints  con  c r-schema.  Con,  con  (r-state)  • true. 

2.  If  r-state  is  a valid  state,  r-op  c R-operations  and  r-op (r-state)  * 

error, 

then  r-state*  - r-op  (r-state)  is  a valid  semantic  relation  state. 
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Chapter  7 - Data  Model  Equivalence:  Definitions 

Now  that  the  semantic  graph  and  semantic  relation  data  models  have  been 
defined,  we  will  begin  the  examination  of  their  equivalence  properties.  Much 
as  Chapter  4 presented  a general  framework  for  data  model  definition,  this 
chapter  gives  definitions  of  equivalence  properties  applicable  to  any  data  models 
defined  in  terms  of  the  Chapter  4 framework.  Examples,  however,  are  drawn 
from  the  semantic  graph  and  semantic  relation  data  models.  An  Informal 
definition  of  database  state  equivalence  for  the  semantic  graph  and  semantic 
relation  data  models  is  also  presented. 

The  notion  of  data  model  equivalence  is  based  on  application  model 
equivalence  which  is  in  turn  based  on  database  state  equivalence.  We  will 
examine  these  in  a "bbttom-up"  manner. 


I 


7. 1 Database  State  Equivalence 

We  would  like  to  say  that  two  database  states  expressed  in  terms  of  two 
different  data  models  are  equivalent  if  they  represent  identical  application  states. 
However,  the  precise  specification  of  state  equivalence  depends  on  the  nature  of 
the  two  data  models.  That  is,  what  is  the  interpretation  of  the  database  state 
in  terms  of  the  application? 

The  ease  with  which  such  interpretations  can  be  given  varies  widely 
between  data  models.  As  we  saw  in  Chapter  3,  attempts  to  define  equivalence 


126  - 


properties  for  syntactic  data  models  have  been  based  on  rather  arbitrary  mapping 
definitions.  The  use  of  semantic  data  models  can  ease  the  task,  of  defining 
equivalent  database  states. 

We  have  claimed  that  many  of  the  semantic  graph  and  semantic  relation 
database  states  shown  in  Chapters  5 and  6 are  equivalent.  We  could  show  this 
by  translating  each  relational  statement  into  a formal  logic  statement  and  then 
showing  that  the  semantic  graph  state  is  a model,  in  the  formal  logic  sense,  for 
the  set  of  logical  statements.  A prerequisite  to  this  would  be  "agreement" 
between  the  semantics  of  the  two  data  models.  Chapters  8,  9 and  10  will 
consider  these  problems  in  detail  for  the  semantic  graph  and  semantic  relation 
data  models. 

For  example,  a mapping  between  association  types  and  the  predicate  used  to 
express  information  concerning  each  association  type  would  be  required 
("supervision”  and  "supervise",  "operation"  and  "operate").  That  is,  there  must  be 
a translation  between  the  natural  language  case  grammars  on  which  the  two 
data  models  are  based.  Hopefully,  the  natural  language  meaning  of  the  words 
would  help  confirm  that  the  mappings  are  done  correctly.  At  this  point  we 
give  a definition  of  state  equivalence  for  semantic  graph  and  semantic  relation 
database  states. 
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Definition  7^1:  Assuming  that  each  semantic  graph  or  semantic  relation  database 
describes  only  identifiable  entities  and  associations,  then  two  database  states. 
each  of  which  is  based  on  either  the  semantic  graph  or  semantic  relation  data 
model,  are  equivalent  if  and  only  if 

1.  For  both  states,  only  one  value  is  specified  for  a single  characteristic 
of  an  entity  or  association. 

2.  For  both  states,  only  one  entity  may  be  specified  as  filling  a given 
role  in  an  association. 

3.  The  states  define  equal  sets  of  entity  and  association  identifiers. 

4.  For  each  entity,  the  same  characteristics  are  defined  in  each  state  and 
are  of  equal  value. 

5.  For  each  association,  the  same  characteristics  are  defined  in  each  state 
and  are  of  equal  value. 

6.  For  each  association,  the  same  roles  are  defined  in  each  state  and  are 
filled  by  the  same  entity. 


Qualifications  1 and  2 of  Definition  7.1  are  aimed  at  semantic  relation  database 
states.  Since  information  may  be  stated  redundantly  (e.g.  in  several  relations), 
this  qualification  guarantees  that  there  will  be  no  inconsistencies  within  the 
database  state. 

As  a basis  for  the  examples  of  this  chapter,  consider  the  semantic  graph 
schema  of  Figure  7.1.  This  is  the  schema  of  Figure  2.1  with  the  addition  of 
the  "supervision"  association.  Figure  7.2  is  a semantic  graph  state  corresponding 
to  this  schema.  Fig  are  7.3  shows  an  "equivalent"  (assuming  proper  constraints) 
semantic  relation  schema  and  state. 
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Figure  7.1  - Semantic  Graph  Schema 
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Figure  7.2  - Semantic  Graph  Database  State 


be  employee  -.object 

employee 

name 

age 

names 

years 

T.Manhart 

32 

C . Gershag 

40 

G .Wayshum 

50 

Employees 


operate:agent 

be  machine: 
operates 

object 

bject 

employee 

machin 

e 

name 

number 

type 

names 

serial -numbers 

machine-types 

T.Manhart 

C. Gershag 

NZ745 

JCL181 

lathe 

press 

Operate 


supervise:agent 

supervi  se  -.object 
operate:agent 

operate  -.object 

employee 

employee 

machine 

name 

name 

number 

names 

names 

serial-numbers 

G. Wayshum 

C. Gershag 

JC1181 

— 

T.Manhart 

NZ745 

Figure  7.3  - Semantic  Relations 
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7.2  Operation  and  Application  Model  Equivalence 

Now  we  will  return  to  the  general  problem  of  defining  data  model 
equivalence  for  any  two  data  models.  Recall  that  a data  model  is  a set  of 
application  models.  We  will  consider  two  data  models 

M = { m- model j,  . . . m- model m) 

and 


N = {n- model],  . . . n- model  n}. 

Our  goal  is  then  to  specify  under  what  conditions  we  can  say  that  Af  Is 
equivalent  to  N. 


It  is  necessary  to  define  application  model  equivalence.  That  is,  how  to 
identify  when  application  models  from  different  data  models  represent  the  same 
application.  We  will  consider  defining  application  model  equivalence  for  two 
application  models  m-modeh  M and  n-modehN.  We  will  assume  that  M-ops 
and  N-ops,  and  M-states  and  N-states  define  the  sets  of  allowable  operations 
and  the  sets  of  valid  states  for  these  two  application  models.  We  will  refer  to 
operations  and  database  states  from  these  sets  as  as  m-op  and  n-op,  and  m-state 
and  n~state,  respectively. 

We  will  assume  that  a state  equivalence  correspondence  has  been  defined 
between  M-states  and  N-states.  That  is,  given  states  m-state  and  n-state,  we 
can  tell  whether  or  not  they  represent  the  same  application  state.  We  require 
that  the  state  equivalence  correspondence  be  onto  both  M-states  and  N-states. 


That  is,  the  "expressive  power"  of  the  state  representation  schemes  are  equal  in 


131 


both  application  models.  ft  also  seems  reasonable  to  assume  that  for  both 
application  models,  some  specific  application  state  Is  represented  by  a unique 
state.  Hence,  we  require  state  equivalence  to  be  a 1-1  onto  correspondence. 

This  does  not  require  that  there  be  a 1-1  mapping  between  components  of 
equivalent  states  as  required,  for  example,  by  the  mapping  of  each  relational 
tuple  to  either  a DBTG  record  or  a set  membership  link  in  [Zimmerman].  The 
specific  manner  of  defining  the  state  equivalence  relation  is  dependent  upon  the 
data  model  being  considered. 

A straight  forward  approach  to  defining  application  model  equivalence  is  to 
require  a 1-1  onto  correspondence,  operation  equivalence,  between  the  sets  of 
operations  M-ops  and  N-ops  such  that  state  equivalence  and  operation 
equivalence  form  an  isomorphism.  We  make  the  assumption  that  the  error 
states  of  all  application  models  are  equivalent. 

First  we  define  operation  equivalence. 

Definition  7.2:  A function 

m •.  M-states  -*  M-states 
is  operation  equivalent  to  a function 

n : N-slates  -*  N-states 

if  and  only  if  for  any  two  equivalent  states  m-statei  M-states  and 
n-state<  N-states,  m(m- state)  is  state  equivalent  to  n(n-state). 

Then  we  define  isomorphic  application  model  equivalence. 


t 
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Definition  7.3:  Two  application  models  m-model  and  n-model  are 

Ispmnrphically  equivalent  if  and  only  if: 

i).  For  every  operation  m-opi  M-ops,  there  exists  exactly  one  n-opiN-ops 
which  is  operation  equivalent  to  m-op. 

and  ii).  For  every  operation  n-op(N-ops,  there  exists  exactly  one  m-opiM-ops 
which  is  operation  equivalent  to  n-op. 

This  is  the  "most  strict"  form  of  application  model  equivalence  we  will 
define.  However,  it  may  in  fact  be  too  strict.  Some  application  models  which 
we  would  like  to  call  equivalent  do  not  satisfy  the  definition.  In  particular, 
consider  the  semantic  relation  and  semantic  graph  data  models.  Suppose  that  in 
the  semantic  relation  data  model  we  consider  an  operation  which  inserts  tuples 
corresponding  to  the  insertion  in  the  graph  model  of  several  independent 
entities.  That  is,  the  relational  operation  is  equivalent  to  the  composition  of 
several  graph  operations. 

To  account  for  this  form  of  equivalence,  we  allow  the  definition  of 
operation  equivalence  to  be  applied  to  the  sets  M-op * and  N-op*  which 
represent  the  set  of  all  compositions  of  operations  in  the  two  application  models. 
This  allows  the  definition  of  composed  operation  application  model  equivalence. 

E I 

Definition  7.4:  Two  application  models  m-model  and  n-model  are  composed 
operation  equivalent  if  and  only  if: 

i).  For  every  operation  m-op(  M-ops,  there  exists  an  operation  mN-ops * 
which  is  operation  equivalent  to  m-op. 

and  ii).  For  every  operation  n-optN-ops,  there  exists  an  operation  miM-ops * 
which  is  operation  equivalent  to  n-op. 


Definition  7.4  does  not  restrict  the  operation  equivalence  correspondence  to 


be  1-1.  It  does  require  that  there  be  an  operation  equivalent  to  each  of  the 
simple  (i.e.  non-composed)  operations  of  M-ops  and  N-ops.  By  composition  of 
simple  operations,  the  correspondence  is  onto  M-ops * and  N-ops* . 

In  practical  terms,  we  would  hope  that  the  operation  equivalence  mappings 
can  be  expressed  as  an  algorithm  rather  than  an  explicit  enumeration  of  an 
extremely  large  number  of  equivalent  pairs.  It  is  such  an  algorithm  which 
would  actually  allow  the  implementation  of  a database  system  which  provides 
users  of  two  different  data  models  with  access  to  the  "same”  data. 

Note  that  based  on  the  types  of  operation  equivalence  introduced  so  far, 
the  translation  of  operations  from  one  application  model  to  an  equivalent 
application  model  can  be  done  independently  of  the  database  state.  That  is, 
operations  are  equivalent  for  all  possible  states.  Consequently,  such  a translation 
Could  be  done  at  "compile-time". 

This  is  not  necessarily  always  the  case.  The  mappings  of  equivalent 
operations  may  be  dependent  upon  the  database  state.  Consider  the  case  of 
adding  to  the  graph  database  state  of  Figure  7.2  a supervision  association 
between  G.Wayshum  and  T.Manhart  resulting  in  Figure  7.4.  The  equivalent 
semantic  relation  database  state  has  the  same  Employees  and  Operate  relations 
shown  in  Figure  7.3  with  the  new  Jobs  relation  shown  in  Figure  7.5, 

The  semantic  relation  operation  equivalent  to  the  stated  graph  operation 
consists  of  the  insertion  of  the  second  tuple  in  the  Jobs  relation  of  Figure  7.5, 
There  are  two  important  things  to  notice  about  this  operation. 

Firstly,  we  did  not  explicitly  delete  the  second  tuple  of  the  Figure  7.3 
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Figure  7.4  - Semantic  Graph  Database  State  after  Insertion 


Jobs  relation.  This  is  because  the  semantic  relation  Insert -tuples  operation  type  is 
defined  (in  Chapter  6)  to  automatically  delete  all  tuples  in  a relation  "less  than" 
those  inserted.  The  partial  ordering  of  tuples  is  based  on  all  non-null  domain 

values  being  greater  than  null  (” ")  and  incomparable  with  any  values  other 

than  null  and  itself. 

More  importantly,  the  values  in  the  added  tuple  are  dependent  upon  the 
database  state  of  Figure  7.3.  Suppose  that  the  semantic  graph  state  of  Figure 
7.4  had  no  operation  association  involving  T.Manhart.  This  would  not  change 
the  graph  operation  needed  to  insert  the  supervision  association  between 
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Figure  7.5  - Semantic  Relation  after  Insertion 


T.Manhart  and  G.Wayshum. 

In  the  case  of  the  relational  model,  the  lack,  of  an  operation  association  for 
T.Manhart  would  change  which  tuple  needed  to  be  added  to  reflect  the  new 
supervision.  The  tuple  to  be  added  would  be  the  second  tuple  in  Figure  7.6 
rather  than  the  second  tuple  in  Figure  7.5.  [Klug  and  Tsichritzis]  have  also 
pointed  out  the  possibility  of  the  state  affecting  the  definition  of  equivalent 
operations. 

We  reflect  the  possibility  of  state  dependence  in  the  following  definitions 
of  state  dependent  operation  and  application  model  equivalences-. 
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Figure  7.6  - Semantic  Relation  showing  State  Dependence  of  Insertion 


Definition  7.5:  A function 

m : M-states  -*  M-states 

and  a function 

n : N-states  -*  N-states 

are  state  dependent  operation  equivalent  for  a given  pair  of  equivalent  states 
m-state:  M-states  and  n-state:  N-states,  if  and  only  if  m(m-state)  is  state 
equivalent  to  n{n-state). 


Definition  7.6:  Two  application  models  m- model  and  n- model  are 

dependent  equivalent  if  and  only  if  for  every  pair  of  equivalent 
m-state:  M-states  and  n-state:  N-states,: 


i).  For  every  operation  m-op:  M-ops,  there  exists  an  operation  mN-ops 
which  is  state  dependent  operation  equivalent  to  m-op  for  m-state  and  n-state. 

and  ii).  For  every  operation  n-op:  N-ops,  there  exists  an  operation  m:  M-ops 
which  is  state  dependent  operation  equivalent  to  n-op  for  m-state  and  n-state. 


The  types  of  application  model  equivalence  defined  are  decreasingly  strict. 


That  is,  isomorphic  equivalence  implies  composed  operation  equivalence,  and 


composed  operation  equivalence  implies  state  dependent  equivalence.  There  are 
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other  possibilities  - we  could  define  state  dependent  equivalence  which  does  not 
allow  composed  operations.  The  intuition  guiding  the  definitions  presented  here 
is  that  as  data  models  become  more  dissimilar,  they  do  so  in  several  ways 
simultaneously.  This  seems  justified  by  the  examples  considered. 

7.3  Data  Model  Equivalence 

Finally  we  reach  the  goal  of  defining  data  model  equivalence. 

Definition  7.7:  Two  data  models  M and  N are  ( isomorphically,  composed 
operation,  state  dependent ) equivalent  if  and  only  if  (isomorphic,  composed 
operation.  state  dependent)  application  model  equivalence  defines  a 
correspondence  between  M and  N which  is  onto  both  M and  N. 

That  is,  if  two  data  models  are  equivalent,  then  for  any  application  model 
in  one  data  mode)  there  is  an  equivalent  application  model  in  the  other  data 
model.  The  expressive  power  of  the  data  models  are  equivalent.  The  definition 
is  dependent  upon  the  type  of  application  model  equivalence  required. 

Partial  equivalence  results  if  when  considering  two  data  models,  there  are 
some  equivalent  application  models,  but  not  all  application  models  In  one  or  both 
data  models  have  equivalent  application  models  in  the  other  data  model.  As  will 
be  seen  in  the  next  chapter,  the  semantic  relation  and  the  semantic  graph  data 
models  are  only  partially  equivalent. 

Note  that  the  definitions  do  not  require  that  application  model  equivalence 
be  1-1.  In  the  case  of  the  semantic  relation  and  graph  data  models,  there  may 
be  several  relational  application  models  state  dependent  equivalent  to  each  graph 
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model.  This  corresponds  to  the  many  different  ways  of  grouping  into  relations 
the  statements  corresponding  to  a single  graph  state.  For  example,  Figure  7.9 
(with  appropriate  constraints)  shows  a single  semantic  relation  which,  like 
Figure  7.3  (consisting  of  three  relations),  is  application  model  equivalent  and 
state  equivalent  to  Figures  7.1  and  7.2.  Such  a property  makes  the  semantic 
relation  model  a good  candidate  for  the  [ANSI]  external  data  model  allowing 
many  different  relational  views  of  a single  semantic  graph  conceptual 
application  model. 


Machine-shop 


supervise : agent 

be  employee: object 
supervise  -.object 
operate :agent 

be  mach1ne:obJect 

operate  -.object 

employee 

employee 

machine 

name 

name 

age 

number 

type 

names 

names 

years 

serial-numbers 

machine-types 

G.Wayshum 

C.Gershag 

40 

JCL181 

press 

— 

T .Manhart 

32 

NZ745 

lathe 

— 

G.Wayshum 

50 

— 

— 

Figure  7.9  - Equivalent  Semantic  Relation 
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Chapter  8 - The  Restricted  Semantic  Relation  Data  Model 


From  the  discussion  and  examples  of  Chapter  7,  it  is  clear  that  the 
semantic  graph  and  semantic  relation  data  models  are  not  isomorphlcally  or 
composed  operation  equivalent.  We  will  state  here  that  they  are  also  not  state 
dependent  equivalent.  While  every  semantic  graph  application  model  has  a 
state  dependent  equivalent  semantic  relation  application  model,  not  every 
semantic  relation  application  model  has  an  equivalent  semantic  graph  application 
model.  The  basic  reason  for  the  existence  of  such  application  models  has  to  do 
with  the  great  freedom  allowed  in  defining  constraints.  While  we  will  not 
develop  the  formal  notation  for  specifying  constraints  until  Chapter  9,  we  will 
still  describe  here  an  example  of  a semantic  relation  application  model  with  no 
semantic  graph  equivalent. 


Consider  the  semantic  graph  application  model  in  Figure  8.1  (identical  to 
Figure  5.1)  and  the  semantic  relation  application  model  (less  constraints)  of 


Figure  8.2  (identical  to  Figure  6.6).  We  have  previously  stated  that  with 
"appropriate"  constraints  these  two  application  models  are  equivalent.  Let  us 
consider  an  "inappropriate"  constraint. 

In  particular,  assume  the  specification  of  a functional  constraint  in  the 
Supervise-operate  relation  of  Figure  8.2  between  the  first  column  (employee 
supervisor  names)  and  the  last  column  (machine  types).  This  constraint  says 
that  the  types  of  every  machine  operated  by  the  employees  of  a given 
supervisor  must  be  equal.  While  this  might  be  a reasonable  restriction,  it 
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Figure  8.1  - More  Complex  Semantic  Graph  Schema 


simply  cannot  be  expressed  by  a semantic  graph  schema.  Note  that  It  is  not 
satisfactory  to  make  "machine-type"  a characteristic  of  employee. 

Firstly,  there  would  not  be  ah  appropriate  value  for  this  characteristic  for 
those  employees  not  supervising  machine  operators.  Secondly,  there  would  be 
no  place  to  record  the  machine  type  for  machines  whose  operators  have  no 
supervisor.  In  light  of  this  problem,  we  would  like  to  identify  those  semantic 
relation  application  models  which  are  state  dependent  equivalent  to  some 
semantic  graph  model.  We  will  do  this  by  giving  algorithms  for  deriving 
semantic  relation  application  models  from  semantic  graph  application  models. 

Constraints  for  the  semantic  relation  data  models  so  derived  will  be  stated 
in  a canonical  form.  Within  the  determination  of  constraint  equivalence,  this 
algorithm  will  define  the  complete  set  of  semantic  relation  application  models 
which  have  semantic  graph  equivalents.  The  set  of  resulting  semantic  relation 
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application  models  will  form  the  restricted  semantic  relation  data  model. 

This  chapter  presents  an  algorithm  for  specifying  the  possible  sets  of 
relations  which  may  form  the  basis  for  semantic  relation  application  models 
state  dependent  equivalent  to  a given  semantic  graph  application  model.  Note 
that  within  the  [ANSI]  framework  this  basic  procedure  could  be  considered  a 
database  design  methodology  for  deriving  multiple  external  semantic  relation 
schemas  corresponding  to  a single  semantic  graph  conceptual  schema. 

Chapter  9 completes  the  specification  of  the  restricted  semantic  relation 
data  model.  The  bulk  of  Chapter  9 is  an  algorithm  for  specifying  the  relational 
constraints  necessary  for  the  restricted  semantic  relation  application  models. 
Formal  definitions  of  the  constraint  types  are  introduced  as  necessary  in  Chapter 
9.  Chapter  10  proves  that  the  semantic  graph  and  restricted  semantic  relation 
data  models  are  state  dependent  equivalent. 

The  rest  of  the  present  chapter  presents  Algorithm  8.1.  Examples  are 
interspersed  with  the  algorithm  definition. 


Algorithm  8,1:  The  following  algorithm  describes  how  to  derive  all  but  the 
constraint  component  Con  of  a relational  schema 

r-schena  - (Re I.  Con,  R-names,  P,  S,  E,  D,  C,  sd,  Str), 
from  a given  semantic  graph  schema 

g-schema  - (E,  A,  D,  r,  c,  fr,  fc,  sd,  R,  C,  Str). 

The  only  restriction  on  g-schema  is  that  no  element  of  A is  of  the  form  be  type 


where  type  < E.  This  is  due  to  the  special  meaning  given  in  the  semantic 
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relation  data  model  to  this  predicate.  If  such  an  association  exists,  we  will 
simply  assume  it  is  renamed  for  purposes  of  the  algorithms  in  Chapters  8-10. 

In  outline,  the  algorithm  is  as  follows: 

I.  Define  an  augmented  graph  schema.  This  adds  the  special  predicates  be  type 
as  nodes  in  the  semantic  graph  schema  g-schema. 

II.  Select  subgraphs  of  the  augmented  schema  defining  that  portion  of  the 
application  which  will  be  described  in  each  relation,  making  sure  that  four 
types  of  requirements  are  satisfied. 

III.  Define  the  relational  schema  components. 

A.  Define  P,  S,  E,  D,  C,  sd  and  Str. 

B.  Define  the  relation  headings  and  names.  Re  I and  R-names,  by  defining  a 
relation  corresponding  to  each  subgraph  defined  in  step  II  as  follows: 

1.  Partition  the  subgraph  corresponding  to  the  grouping  of  the 
columns  by  predicate:case  pairs  as  desired  in  the  relation's  heading. 

2.  Define  the  relation's  heading  for  columns  describing  predicate:case 
pairs. 

3.  Define  the  relation's  heading  for  columns  describing  predicate 
characteristics. 

4.  Combine  the  results  of  steps  2.  and  3. 

5.  Associate  a relation  name  with  the  resulting  heading. 

For  the  reader  desiring  only  a high  level  view,  the  four  requirements  specified 
in  step  II.  which  must  be  satisfied  by  the  subgraphs  (l.e.  the  relations)  should 
be  examined  in  addition  to  the  outline  above. 


The  algorithm  itself  follows: 
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I.  Define  an  augmented  graph  schema.  This  adds  the  special  predicates  be  type 
as  nodes  in  the  semantic  graph  schema  g- schema.  These  nodes  are  added  so  that 
once  transformed  into  relations,  the  existence  of  entities  can  be  stated 
independently  of  their  participation  in  associations. 

Let 

g-schema’  - (E\  A*.  0*.  r\  c\  fr\  fc\  ad*.  R\  C\  Str*), 

where: 


1.  A’  «=  A u ((be  type,  object)  | type  « E) 

This  adds  the  be  type  predicate  tor  each  entity  type  to  the  set  of 
associations. 


2.  r’  « r u (((be  type,  object),  type)  | type  < E) 

This  adds  a role  set  edge  connecting  the  new  association  set  node  to  the 
proper  entity  set  node. 


3.  fr’  » fr  u (((be  type,  object),  (restricted,  mandatory))  | type  t El 

This  states  that  every  entity  of  the  given  type  must  belong  to  exactly  one 
"be  type"  association. 


4.  E*  - E 
6.  c’  - C 

8.  sd*  • 9d 

10.  C*  - C 


5.  D-  - D 
7.  fc*  - fc 
9.  R’  - R 
11.  Str’  - Str 


Figure  8.3  shows  a semantic  graph  schema  identical  to  Figure  2.1.  Figure  8.4.a 
shows  such  an  augmented  schema  for  the  semantic  graph  schema  of  Figure  8.3 
while  Figure  8.4.b  shows  the  augmented  schema  for  the  semantic  graph  schema 
of  Figure  8.1. 
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Figure  8.4  - Augmented  Semantic  Graph  Schemas 
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II.  This  step  defines  a set  SG  of  subgraphs  of  the  augmented  semantic  graph 
schema  g- schema'.  Each  subgraph  so  defined  is  meant  to  specify  some  portion 
of  the  information  describing  the  application  which  the  user  of  the  relational 
database  system  would  like  to  see  displayed  in  a single  relation. 

Informally,  subgraphs  will  be  delineated  by  encircling  a portion  of  the 
augmented  semantic  graph  schema  of  concern.  Formally,  each  subgraph  will  be 
defined  by  specifying  a set  of  paths  Pa  and  a set  of  characteristic  set  edges  Co. 
Each  path  consists  of  a role  set  edge  (connecting  an  association  set  node  and  an 
entity  set  node)  and  a characteristic  set  edge  (connecting  an  entity  set  node  and 
a characteristic  set  node).  We  will  also  require  that  the  entity  set  terminus  of 
the  role  set  edge  is  identical  to  the  entity  set  terminus  of  the  characteristic  set 
edge.  That  is,  the  two  edges  form  a continuous  path  in  the  schema.  We 
specify  a role  set  edge  by  an  element  of  (A’  x R')  and  a characteristic  set  edge 
by  an  element  of  (E*  x C’ ) . So  Pa  is  a subset  of  ( (A*  x R’)  x (E*  x C’)).  A 
sample  path  definable  in  Figure  8. 4. a is  ( (operat  ion, agent) , (empl ogee, name) ) . 

The  set  Co  is  a subset  of  (A'  x C' ) where  each  element  is  a characteristic 
set  edge  which  joins  an  association  set  node  with  a characteristic  set  node.  A 
sample  edge  in  Figure  8.4.b  is  (operation,  cost-per-hour ) . 

As  previously  stated,  each  relation  will  be  defined  as  corresponding  to  a 
particular  subgraph.  Figure  8.5  shows  the  semantic  relations  of  Figure  2.5. 
Figure  8.6  shows  the  two  subgraphs  (of  the  Figure  8.4.a  augmented  semantic 
graph  schema)  corresponding  to  the  two  relations  of  Figure  8.5.  Figure  8.6.a 
shows  the  "view"  of  the  application  shown  in  the  Employees  relation  of  Figure 
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8.5  while  Figure  8.6.b  shows  the  "view"  of  the  application  shown  in  the 
Operate  relation.  The  paths  in  Pa  and  edges  in  Ca  are  shown  as  heavy 
numbered  lines  alongside  the  edges  which  make  up  the  paths. 

Figure  8.7  shows  the  four  subgraphs  (of  the  Figure  8.4.b  augmented 
semantic  graph  schema)  corresponding  to  the  four  relations  of  Figure  8.2. 

Figures  8.7.a,  b,  c and  d correspond  respectively  to  the  relations  Locations-used, 

Machine-age.  Operation  and  Supervise-operate. 

To  ensure  that  the  set  of  relations  corresponding  to  the  specified  set  of 
subgraphs  completely  describe  the  application  (and  hence  is  equivalent  to  the 
semantic  graph  schema),  several  requirements  must  be  fulfilled  by  the  choice  of 
subgraphs.  One  subgraph  may  be  used  to  fulfill  any  number  of  requirements. 

These  four  requirements  are: 

1.  Existence  and  Identity.  For  each  association  set  node  a c A*  there 
exists  at  least  one  subgraph  sg  < SC  such  that: 

a.  For  each  role  r such  that  fr*  (o.r)  .restr  ict  « restr  icted,  for 
each  characteristic  c where  fc’(e.c)  « key  and  e = R'io.r),  there  is 
a path  ( (a,r) , (e.c) ) in  the  set  Pa  of  sg. 

b.  For  each  characteristic  c such  that  fc’(o.c)  - key,  there  is  a 
characteristic  edge  (o.c)  in  the  set  Co  of  sg. 

This  requirement  implies  that  for  each  association,  at  least  one  relation  will 

contain  all  of  the  characteristics  needed  to  uniquely  identify  the  association. 

This  requirement  also  implies  that  for  each  entity  type,  there  will  be  at  least 

one  relation  with  the  predicate  be  tgpe  which  has  all  of  the  key  characteristics 


of  the  entity  type. 
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Figure  8.5  - Complete  Semantic  Relation  State 
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a.  Employees  Relation 


b.  Operate  Relation 


Figure  8.6  - Relational  Views  Corresponding  to  Figure  8.5 
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supervision  dollars  serial -numbers  be  machine 


Figure  8.7  - Relational  Views  corresponding  to  Figure  8.2 
(Continued  on  Next  Page) 
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c.  Operations  Relation 


Figure  8.7  - Relational  Views  corresponding  to  Figure  8.2 
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2.  Non-identifying  Association  Characteristics.  For  each  association  set 
node  a c A',  for  each  characteristic  c,  such  that  fcMo.c)  - non-key,  there 
exists  at  least  one  subgraph  sg  < SC  such  that  sg  satisfies  the  conditions 
of  requirement  1.  above  for  a and  the  characteristic  edge  (o.c)  is  in  the 
set  Co  of  sg. 

This  requirement  implies  that  for  each  association  which  has  non-key 
characteristics,  there  will  be  a relation  for  each  non-key  characteristic  which 
contains  the  characteristic  along  with  all  characteristics  needed  to  uniquely 
identify  the  association. 


3.  Non-identifying  roles.  For  each  association  set  node  At  A’,  for  each 
role  r,  such  that  fr*  (o,r)  .restrict  = unrestricted,  there  exists  at  least 
one  subgraph  sg  c SC  such  that: 

a.  sg  satisfies  the  conditions  of  requirement  1.  above  for  a. 

b.  For  each  characteristic  c such  that  fc’ (e.c)  - key  where  e = 
R'(ff.r),  the  path  ((o.r),  (e.c))  is  in  the  set  Pa  of  sg. 

This  requirement  implies  that  the  characteristics  which  identify  each 

unrestricted  role  entity  will  be  in  at  least  one  relation  which  identifies  the 

association  to  which  the  entity  belongs. 


4.  Non-identifying  Entity  Characteristics.  For  each  entity  set  node  e c E’ , 
for  each  characteristic  c,  such  that  fc’(e.c)  - non-key,  there  exists  at 
least  one  subgraph  sg  such  that: 

a.  The  path  ((be  e,  object),  (e.c))  is  in  sg. 

b.  For  each  characteristic  c*  such  that  fc*  (e.c*)  - key,  the  path 
((be  e,  object),  (e.c*))  is  in  sg. 

This  requirement  Implies  that  for  each  entity  set  type  there  will  be  relations 


which  associate  each  non-Key  characteristic  with  the  identifying  characteristics 


of  the  entity. 
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As  a simple  example,  we  will  examine  how  the  subgraphs  of  Figure  8.6 
satisfy  the  four  requirements. 

1.  Existence  and  Identity.  There  are  three  elements  of  A’,  be  employee, 
be  machine  and  operation.  be  employee  has  one  restricted  role  object  which 
has  one  Key  characteristic  name.  So  path  1 of  Figure  8.6.a  satisfies  this 
requirement  for  be  employee.  Similarly,  path  4 of  Figure  8.6.b  satisfies  the 
requirement  for  be  machine. 

operation  has  the  restricted  role  object  with  key  characteristic  number. 
Path  2 of  Figure  8.6.b  satisfies  this  requirement  for  this  association  set  node. 

2.  Non-identifying  Association  Characteristics.  None  of  the  association  nodes  In 
Figure  8.6  have  characteristics,  so  this  requirement  is  trivially  satisfied.  (For 
the  schema  of  Figure  8.7,  path  2 in  conjunction  with  operation  identifying 
paths  3,  4 and  5 of  Figure  8.7.c  satisfies  this  requirement  for  the  non -key 
characteristic  cost-per-hour  of  the  operation  association.) 

3*  Non-identifying  Roles.  The  key  characteristic  name  of  the  unrestricted  role 
agent  of  operation  is  included  in  path  1 of  Figure  8.6.b.  Hence  the 
requirement  is  satisfied  since  this  subgraph  also  has  path  2 which  satisfies 
requirement  1 for  operation. 

4.  Non-identifying  Entity  Characteristics.  The  non-key  characteristics  age  of 
employee  and  type  of  machine  are  included  in  path  2 of  Figure  8.6.a  and  path  5 
of  Figure  8.6. b respectively.  This  requirement  is  then  satisfied  since  path  1 
((be  employee,  object),  (employee,  name))  is  also  in  Figure  8.6.a  and  path  4 


/ 
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((be  m.'ic.h i ne , object),  (machine,  number))  is  also  In  Figure  8.6.b. 

HI.  Define  the  relational  schema  components. 

A.  Define  P,  S,  E,  0,  C,  sd  and  Str.  This  step  Involves  the  definition  of 
the  aforementioned  relational  schema  components  and  appropriate 
translation  functions  between  the  components  and  the  corresponding  graph 
schema  components. 

Formally,  within  the  constraints  to  be  stated  on  the  translation 
functions,  the  choice  of  elements  for  these  components  are  arbitrary.  But 
it  is  the  choice  of  these  translation  functions  which  will  determine  the 
ease  of  semantic  interpretation  of  the  relational  schema  in  terms  of  the 
graph  schema.  That  is,  arbitrary  mappings  would  not  be  very 
understandable. 

There  must  be  a translation  function  Tp:A*  -*  P.  This  function  must 
be  a 1-1  function  onto  P - Ibe  e|e<Et.  This  maps  each  association  set 
node  onto  a corresponding  predicate. 

Ts:R’  -•  S is  a 1-1  onto  function  which  maps  each  role  name  onto  a 
corresponding  semantic  case.  Note  that  for  the  examples  of  this  thesis,  Te 
has  always  been  chosen  to  be  the  identity  function. 

There  must  also  be  1-1  onto  functions  Te,  Td,  Tc  and  Tstr  which 
map  the  E’t  O’,  C*  and  Str*  components  of  g-schema’  onto  the 
corresponding  components  of  r-schema.  For  notatlonal  convenience  (and  a 
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suggestion  for  a "preferred"  function),  we  will  leave  exact  mention  of 
these  functions  out  of  the  algorithms  of  this  and  following  chapters 
assuming  the  identity  function  for  each  of  these.  They  would  be 
included,  if  desired,  in  the  same  manner  in  which  Ts  will  be  used,  sd  of 
r-schema  must  be  defined  such  that  domain  membership  is  preserved  by  Td 
and  Tstr. 

B.  Given  a set  SG  of  subgraphs  of  g-9chema',  define  the  relation  heading 
and  names  schema  components.  Re  I and  R-names.  For  each  subgraph 
sg  c SG  define  a relational  heading  as  follows: 


1.  Partition  the  subgraph  corresponding  to  the  grouping  of  the 

columns  by  predicate:case  pairs  as  desired  in  the  relation's  heading. 

Formally,  partition  the  set  of  paths  Pa  of  sg  such  that: 

a.  For  any  two  paths  ((c.r),  (e.c))  and  ((e'.r'l,  (e'.c*)) 
in  the  same  partition: 

1.  a >*  o' 

ii.  e - e' 

iii.  c«c'  implies  there  also  exists  paths  Ma.r), 
(e.c'))  and  Me'.r'),  (e'.c))  in  the  partition. 

iv.  C(e,c)  « C(e' ,c') 


b.  For  any  two  association  set  nodes,  o and  a',  if  every 
partition  which  contains  a path  terminating  on  a also  contains 
a path  terminating  on  a',  then  a must  be  the  association  be  e 
for  some  entity  set  e. 


This  step  defines  the  column  grouping  of  the  relation,  l.e.  which 


predicate:case  pairs  will  be  grouped  together.  For  example,  referring 
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to  the  numbered  paths  in  Figure  8.6. b,  the  Operate  relation  is  defined 
by  ^the'  partition  {{1},{2, 3,4,5}}.  This  specifies  that  the  name 
characteristic  of  operate: agent  will  form  one  column  group  while  the 
number  and  type  characteristics  for  be  machine: object  and 
operate: object  are  grouped  together.  For  the  Employees  relation  of 
Figure  8.5,  the  partitioning  of  the  paths  of  Figure  8.6.a  is  simply 
{{1,2}}.  There  is  only  a single  column  group.  The  edges  Ca 
specifying  characteristics  of  associations  will  be  grouped  in  step  3 
below  according  to  the  association  described. 

Restriction  a.  above  implies  that  every  predicate:case  pair  in  the 
same  column  group  is  defined  for  the  same  entity  type  and  the  same 
characteristics.  It  is  also  required  that  two  cases  of  the  same 
predicate  cannot  be  in  the  same  column  group. 

Requirement  b.  above  says  that  for  any  two  predicates  other  than 
be  e,  there  must  be  at  least  one  case  for  each  predicate  which  is  in  a 
column  group  without  the  other  predicate.  This  allows,  through  the 
use  of  null  values,  the  existence  of  tuples  which  state  that  an 
association  of  one  type  exists  while  an  association  of  the  other  type 
does  not  exist,  be  e is  an  exception  because  It  is  mandatory  for  every 
entity  type. 

The  partitioning  of  the  paths  fa  in  Figure  8.7  to  define  the 
columns  describing  predicate:case  pairs  in  Figure  8.2  is  as  follows: 
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Relation  (Figure  8.2) 

Path  Figure 

Partition 

Locations-used 

8.7.a 

{{2}} 

Machine-age 

8.7.b 

{{ 1.2,3}} 

Operations 

8.7.c 

{{1},{4.5,6,7,8,9}> 

Supervise-Operate 

8.7.d 

{{5},{1.2t3,4.6,7},{8,9}) 

2.  Define  the  relation's  heading  for  columns  describing  predicate.-case 
pairs.  That  is,  with  each  element  of  the  partition  defined  in  the  step 
above  associate  an  element  of  type  H (top  level  grouping)  where  this 
element  is 

(Hp.s)  | (o.r)  is  the  first  element  of  a 

path  in  the  partition  element 
and  (p,s)  = (Tp (o) , Ts(r))l,  e) 

where  e is  the  entity  set  on  which  all  paths  of  the  partition  element 
terminate. 

This  specifies  the  set  of  predicatercase  pairs  and  the  entity  type 
which  form  the  top  level  heading  of  this  relation  column  group. 
With  each  element  of  H so  defined,  associate  a function  cj  where  the 
domain  of  cj  is  the  set  of  characteristic  names  which  label 
characteristic  edges  in  the  paths  of  the  partition.  c / is  defined  so 
that  it  maps  each  of  these  characteristic  names  into  the  same  domain 
name  which  labels  the  terminus  of  the  characteristic  edge  defined  by 


fc\ 
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This  specifies  the  various  characteristics  and  their  domains  which 
describe  the  column  group's  predicate:case  pairs.  For  the  partition  of 
Figure  8.6. b,  two  such  elements  of  H would  be  defined  - one  for  each 
element  of  the  partition.  They  would  be  ( ((operate, agent)  I , 
employee)  and  (I  (be  machine, ob ject) , (operate, object)  I , machine). 
The  function  mapping  name  into  names  would  be  associated  with  the 
first  element.  The  function  mapping  number  into  ser  i a I -numbers  and 
type  into  machine-types  would  be  associated  with  the  second  element. 

3.  Define  the  relation's  heading  for  columns  describing  predicate 
characteristics.  For  each  association  a which  has  characteristic  edges 
in  the  set  Co  of  sg,  define  an  element  A of  type  H such  that  h - 
Tp (o) . 

Associate  with  h a Char  function  c / such  that  the  domain  of  cf  is 
the  set  of  characteristic  names  occur  lng  in  the  association's 
characteristic  edges  in  the  partition.  And  let  it  map  each  such 
characteristic  name  into  the  domain  name  defined  for  the  edge  by  fc’. 

This  specifies  the  characteristics  and  domains  describing  the 
predicates  in  the  relation. 

4.  Combine  the  results  of  steps  2.  and  3.  Define  the  relational 
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be  the  union  of  those  elements  of  H defined  in  steps  2.  and  3.  above. 
Let  heading  map  each  such  element  of  H into  the  function  associated 
with  the  element  as  defined  in  steps  2.  and  3.  above. 

5.  Associate  a relation  name  with  the  heading  defined  in  step  4. 

The  above  steps  carried  out  for  each  subgraph  completely  define  the 
relations  in  the  schema.  The  function  mapping  the  name  into  its  heading 
defines  the  Re  I component  of  the  relational  schema  r-schema.  The  domain 
of  Re  I is  the  R-nantes  component. 

Continuing  with  the  examples  previously  used,  these  steps  would 
result  in  the  relational  headings  of  Figures  8.5  and  8.2. 

End  of  Algorithm  8.1 

Note  that  the  following  are  steps  in  Algorithm  8.1  in  which  there  is  the 
possibility  of  choice: 

1.  The  choice  of  subgraphs  in  step  II. 

2.  The  choice  of  translation  functions  in  step  III.A. 

3.  The  choice  of  path  partitioning  in  step  III.B.l. 

4.  The  choice  of  relation  names  in  step  I1I.B.5. 

For  a given  set  or  choices  and  resulting  relational  schema  components.  Chapter  9 
will  present  an  algorithm  for  constraint  definition. 


( 
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Chapter  9 - The  Restricted  Semantic  Relation  Data  Model: 

Constraints 

This  chapter  deals  with  the  constraint  component  of  the  semantic  relation 
data  model.  To  show  the  motivation  for  the  various  kinds  of  constraints,  they 
are  introduced  in  a "structured"  manner.  In  particular,  they  are  Introduced  in 
terms  of  the  algorithm  defining  constraints  for  the  restricted  semantic  relation 
data  model.  In  doing  so,  the  form  of  the  constraints  for  the  "unrestricted" 
semantic  relation  data  model  is  also  specified.  While  the  definitions  of  the 
relational  state  and  operations  were  relatively  simple,  the  complexity  of  the 
relational  model  resides  in  the  constraint  definition  component  of  the  schema. 

The  semantic  relation  constraints  are  meant  to  be  in  the  "spirit"  of  the 
types  of  constraints  proposed  for  the  syntactic  relational  model  [Codd  70]. 
Constraints  are  predicates  on  the  database  state  (i.e.  functions  evaluating  to  true 
or  false).  In  addition  to  applying  these  predicates  directly  to  the  database 
relations,  the  predicates  may  also  be  applied  to  relations  which  can  be  defined 
in  terms  of  the  database  relations  via  a "relational  algebra". 

It  should  be  emphasized  that  this  approach  is  taken  here  not  because  it 
was  necessarily  considered  a "good"  approach.  Rather,  it  is  done  to  provide  a 
vehicle  for  studying  an  application  of  the  general  framework  for  equivalence 

problems  and  to  explore  the  characteristics  of  a semantic  version  of  the 

Jf 

relational  model.  In  fact,  as  will  be  seen,  the  kinds  of  constraints  presented 
here  seem  to  be  rather  poor  when  judged  on  the  criterion  of  understandability 
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to  the  user. 


9. 1 The  Semantic  Relational  Algebra 

For  the  semantic  relation  data  model,  the  relational  algebra  is  defined  by  a 
set  of  functions  which  map  pairs  consisting  of  a relational  heading  and  a set  of 
tuples  into  a new  relational  heading  and  set  of  tuples.  We  call  these  the 
constraint  operation  types.  The  basic  difference  between  these  operation  types 
and  the  Insert-tuples  and  Delete-tuples  operation  types  is  that  the  constraint 
operation  types  create  new  relational  headings.  That  is,  they  define  a new  form 
for  the  relational  table.  Insert-tuples  and  Delete-tuples,  on  the  other  hand,  only 
change  the  state  of  the  relation  - the  set  of  tuples  in  the  relation. 

The  constraint  operations  which  create  new  relational  headings  are  used 
only  in  specifying  constraints  on  the  state  of  the  database.  They  would  not  be 
used  by  the  database  system  user  to  create  new  relations.  While  this  ability  to 
dynamically  create  new  relations  would  be  a worthwhile  feature,  it  presents  a 
data  model  too  complex  for  the  purposes  of  this  thesis.  The  constraint  operation 
types  are  Pioject,  Case-join,  Predicate-join  and  Conjunction.  The  constraint  operation 
types  and  constraint  types  are  formally  defined  in  Appendix  3 (relying  on  the 
definitions  in  Appendix  2).  Each  constraint  and  constraint  operation  type  will 
be  explained  in  detail  as  it  is  needed  by  the  restricted  semantic  relation  data 
model  constraint  definition  algorithm. 


The  operations  are  semantic  counterparts  of  relational  operations  defined  for 
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syntactic  databases  as  in  [Codd  70],  [Chen]  points  out  that  unrestrained  use  of 
the  syntactic  operations  in  [Codd  70]  can  lead  to  misunderstandings  and 
ambiguities  when  trying  to  give  semantic  interpretations  to  the  syntactic 
relations.  On  the  other  hand,  the  operations  presented  here  preserve  the 
semantics  of  the  arguments  to  the  operations. 

9.2  Constraint  Types 

We  allow  five  types  of  constraints.  Each  constraint  type  is  a function 
which  evaluates  to  a Boolean  value.  The  types  of  constraints  are  Equal,  Equate, 
Subset.  Non-null  and  Col-func.  These  types  of  constraints  are  sufficient  for  the 
purpose  of  defining  relational  application  models  equivalent  to  semantic  graph 
application  models.  It  is  not  intended  that  they  be  as  "expressive"  as  any  given 
user  might  desire.  More  general  forms  of  constraints  have  been  proposed  for 
syntactic  relational  models  ([Hammer  and  McLeod]  and  [Stonebraker]). 

It  is  conjectured,  though  not  proved,  that  with  one  exception  the 
constraint  types  and  constraint  operation  types  are  all  necessary  to  define 
constraints  such  that  each  semantic  relation  application  model  with  schema 
components  derived  by  Algorithm  8.1  is  equivalent  to  the  semantic  graph 
application  model  from  which  it  was  derived.  The  exception  is  that  an  instance 
of  ihe  Equate  constraint  can  be  expressed  as  two  instances  of  the  Subset 
constraint.  "Necessary"  is  meant  in  the  sense  that  if  any  one  constraint  or 
constraint  operation  type  were  not  allowed,  those  remaining  would  not  be 


i 
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sufficient.  This  does  not  rule  out  some  other  set  of  sufficient  constraints  and 
operations. 

Formally,  the  set  of  allowed  constraints  in  the  unrestricted  semantic 
relation  data  model  are  defined  as  follows.  Those  of  the  restricted  semantic 
relation  data  model  form  a subset  of  the  constraints  in  the  unrestricted  data 
model. 


Definition  9.J_:  The  constraints  in  a semantic  relation  schema  must  be  functions 
of  the  form: 


constraint  ( rei:  (R-names-*Heading) , *tote:R-state) 
returns  Boolean 

return  con  type^  (argj , arg2,  . . . argn) 
wher  e 1 < i <5 


and  for 

i • 1 : 

con-type,  - Equal 

argj  ( 

(Heading 

X 

Relation) 

arcj2  < 

(Heading 

X 

Relat ion) 

for 

i - 2: 

con -type,  » Subset 

argj  c 

(Heading 

X 

Relat ion) 

arg2  t 

(Heading 

X 

Relation) 

for  i 

i « 3: 

con-typej  = Equate 

argj  e 

(Head i ng 

X 

Relat ion) 

arg 2 « 

(Heading 

X 

Relation) 

for  i 

1 - 4: 

con-type,  - Non  null 

arcj]  ( 

(Head i ng 

X 

Relat ion) 

for  i 

• 5: 

con-type,  - Col-func 

argj  c 

Heading 

arg2  c 

Head i ng 

arg3  ( 

(Heading 

X 

Relation) 

For  brevity,  we  will  express  these  constraints  in  this  thesis  using  the 
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following  notation: 

con-type,  [arg|.arg2.  . . . argnl . 

The  correspondence  between  this  notation  and  the  functional  form  of 
Definition  9.1  should  be  clear. 

We  will  also  express  those  argj  t (Heading  x Relation)  in  terms  of 
compositions  of  the  constraint  operations  applied  to  the  relations  in  the  database 
state.  When  stating  the  name  r of  a database  relation  in  such  an  expression, 
we  are  referring  to  the  pair  consisting  of  the  relational  heading  from  the 
schema  (Re  Mr))  and  the  relational  state  (R-state(r) ). 

Figure  9.8.  located  at  the  end  of  this  chapter,  presents  the  constraints 
associated  with  Figure  8.5.  Figure  9.9,  also  located  at  the  end  of  this  chapter, 
presents  the  constraints  associated  with  Figure  8.2.  It  can  be  seen  that  a fair 
number  of  constraints  are  needed  even  for  a database  with  a small  number  of 
relations.  Each  of  the  constraints  in  Figure  9.8  will  be  identified  with  the  step 
in  the  algorithm  which  mandated  its  definition. 
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9.3  The  Constraint  Definition  Algorithm 
Algorithm  9.1: 

Consider  a semantic  graph  schema 

g-schema  - (E.  A.  D.  r,  c,  fr,  fc.  sd,  R,  C,  Str) 
and  a semantic  relation  schema 

(--schema  - (Rel,  Con,  R-names,  P,  S,  E,  D.  C,  sd,  Str). 

If  the  semantic  relation  schema  components,  except  for  the  constraint  component 
Con,  were  derived  from  g-schema  by  Algorithm  8.1  and  if  the  Con  component  is 
derived  by  the  algorithm  to  follow,  then  r-schema  is  a schema  in  the  restricted 
semantic  relation  data  model. 

Our  goal  is  to  define  constraints  so  that  database  states  based  on  r-schema 
are  always  equivalent  to  those  based  on  g-schema.  Definition  7.1  defines  the 
items  of  concern  for  such  an  equivalence.  Accordingly,  the  outline  of  this 


algorithm  is: 
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I.  Define  base  expressions  in  terms  of  the  database  relations  so  that  there  is  a 
base  relation 

-for  each  entity  type  identifier 
-for  each  association  type  identifier 

-for  each  non-identifying  characteristic  of  an  association  type 
-for  each  non-identifying  role  of  an  association  type 
-for  each  non-identifying  characteristic  of  an  entity  type. 

A.  Identify  portions  of  the  semantic  graph  schema  for  each  base 

expression. 

B.  Define  the  base  expressions. 

II.  Define  constraints  on  those  portions  of  the  database  identified  by  the  base 
expressions  so  that: 

A.  All  alternative  choices  for  base  expressions  are  equal. 

B.  Each  entity  has  only  a single  value  for  each  non-identifying 
characteristic. 

C.  Each  association  has  a single: 

1.  Value  for  each  non-identifying  characteristic. 

2.  Entity  filling  a role  for  non-identifying  roles. 

D.  Roles  are  only  filled  by  existing  entities  for  both  cf: 

1.  Identifying  roles. 

2.  Non-identifying  roles. 

E.  Mandatory  role  requirements  are  fulfilled. 

F.  All  characteristics  of  entities  and  associations  are  assigned  values. 

III.  Define  each  database  relation  in  terms  of  base  expressions. 

A.  Build  the  expression  for  each  relation  to  include: 

1.  Desired  characteristics  oi  each  entity  in  the  relation. 

2.  Desired  characteristics  and  roles  of  associations  in  the  relation. 

3.  The  combination  of  information  concerning  a single  predicate. 

4.  The  combination  of  different  predicates  into  a single  relation. 

B.  Define  the  constraint  enforcing  the  composition  of  base  relations  into 
the  database  relation. 


The  detailed  algorithm  itself  follows. 
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I.  The  base  expressions  can  be  considered  to  define  the  "basic  facts"  about  the 
application.  One  such  base  expression  is  defined  for  each  of  the  requirements 
on  subgraphs  of  the  augmented  semantic  graph  schema  as  specified  previously  in 
step  II  of  Algorithm  8.1. 

A.  Requirements  1 to  4 in  step  II  of  Algorithm  8.1  state  that  certain  paths 
and  characteristic  edges  must  be  present  in  some  subgraph.  For  each  such 
required  set  of  paths  and  characteristic  edges  we  define  a subgraph  of 
this  subgraph.  This  sub-subgraph  would  consist  of  the  minimal  set  of 
paths  and  edges  which  satisfy  the  stated  requirement.  In  the  case  where 
there  are  several  such  sub-subgraphs,  any  arbitrarily  chosen  sub-subgraph 
is  used.  (In  any  case,  any  such  sub-subgraph  would  specify  an  identical 
subgraph  of  the  total  augmented  semantic  graph  schema.) 

For  the  relations  of  Figure  8.5,  the  sub-subgraphs  are  shown  in 
Figure  9.1  as  subgraphs  of  the  graphs  in  Figure  8.6  which  define  the 
relational  "views"  of  Figure  8.5.  The  requirements  to  be  satisfied  by  the 
sub-subgraphs  can  be  informally  restated  as: 


: 
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be  employee 
object 


be  machine 
♦ 


object 


i _ 1 

employee  <--9g-^  (operation  machine  } 

name^,/^  '^X^age  \.  numbe^^  >^type 

namcs  years  \ser  la  1 -numbers/7  machine-types 

a. 


be  employee 
* 


be  machine 

object 


object 

employee  <-39501.  operation  i-P^ECt^/chlne 
name^/^  NS'\^age 
names  years 


number/X  A type 
\ 

serial  lumbers  /machine-types 


c. 


Figure  9.1  - Sub-subgraphs  of  Figure  8.6 
(Continued  on  Next  Page) 
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1.  Existence  and  Identity.  For  each  association  type,  there  is  a 
sub-subgraph  which  contains  the  minimal  set  of  edges  and  paths 
which  form  the  identifying  characteristics  of  the  association  type. 

For  the  example  pertaining  to  Figure  8.5,  the  corresponding 
sub-subgraphs  are  shown  (as  subgraphs  of  Figure  8.4.a)  in  Figures 
9.1. a,  b,  and  c for  the  association  types  operation,  be  employee  and 
be  machine,  respectively. 

2.  Non-identifying  Association  Characteristics.  For  each  non-key 

characteristic  of  an  association  type,  there  is  a sub-subgraph  which 
contains  the  corresponding  characteristic  edge  and  also  all  of  the  paths 
and  edges  needed  to  Identify  the  association. 

None  of  the  association  types  in  Figure  8.4.a  have  characteristics, 
so  there  are  no  such  sub-subgraphs  of  Figure  8.4.a. 

3.  Non-identifying  Roles.  For  each  unrestricted  entity  role  in  an 

association  type  (i.e.  those  which  are  not  used  to  identify  the 
association),  there  is  a sub-subgraph  which  contains  the  key 

characteristic  paths  of  this  role  and  also  the  identifying  paths  and 
edges  of  the  association. 

For  the  Figure  8.5  example,  Figure  9.1.d  shows  the  sub-subgraph 
for  the  only  unrestricted  entity  role,  the  agent  role  of  the  operation 
association. 

4.  Non-identifying  Entity  Characteristics.  For  each  non-key 

characteristic  of  an  entity  type,  there  is  a sub-subgraph  which 
consists  of  a path  from  the  be  type  association  node  to  each  key 

characteristic  edge  and  a path  from  the  be  type  node  to  the  non-key 
characteristic  node. 

Figures  9. I.e  and  f show  the  sub-subgraphs  for  the  non-key 
characteristic  age  of  employee  and  the  non-key  characteristic  type  of 
machine  respectively. 

Note  that  each  sub-subgraph  shown  in  Figure  9.1  is  contained  In  one 
of  the  subgraphs  shown  in  Figure  8.6. 
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B.  Given  the  specification  of  the  sub-subgraphs  as  paths  and  edges  of 
specified  subgraphs,  the  relational  expressions  defining  the  base  relations 
can  be  specified.  This  Is  done  by  first  following  step  III.B  of 
Algorithm  8.1  for  each  sub-subgraph  to  define  relational  headings.  If  for 
sub-subgraphi  which  is  a subgraph  of  subgraph  j corresponding  to 
Relation j,  we  thusly  define  heading we  then  define  the  base  relational 
expression  for  this  sub-subgraph  as 

Baserelation t - Project!  heading j,  Relation  j 1. 

Figure  A3.1  defines  the  Project  constraint  operation.  The  operation 
allows  the  definition  of  relations  which  consist  of  statements  directly 
implied  by  the  argument  relation.  This  Is  done  by  eliminating  columns  of 
the  relation  and/or  eliminating  certain  of  the  predicate:case  pairs  from  the 
heading  of  the  relation  for  those  columns  which  have  more  than  one 
predicate:case  pair. 
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For  example,  using  the  Operations  relation  from  Figure  8.2, 

Project  ( 

, Operations) 

or  in  set  notation 

Project({(({(be  machine,object)},machine), 

{(number  ,serial-numbers),(type, machine-types)))}. 

Operations) 

results  in  the  Be -machine- base  relation  shown  in  Figure  9.2.  Note  that 
columns  have  been  eliminated  and  that  the  predicate:case  pair 
operate: oh ject  has  been  eliminated  from  the  retained  columns.  Yet  the 

I 

meaning  of  the  relation  is  still  apparent. 

‘ 

It  is  assumed  that  the  relational  arguments  to  Project,  as  well  as  the 
other  constraint  operation  types,  are  in  proper  form.  That  is,  the  tuples 
are  in  the  form  dictated  by  the  accompanying  heading.  This  would  always 
be  the  case  when  all  arguments  are  based  on  the  well-formed  database 
relations. 

The  first  part  of  Figure  9.8  defines  the  base  expressions  corresponding 
to  the  Figure  8.5  example.  Note  that  the  Employee-char -base  base 
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Be-machine-base 


be  machine:objcct 

machine 

number 

type 

serial -numbers 

machine-types 

NZ745 

lathe 

JCL181 

press 

Figure  9.2  - Result  of  Project  Operation 


expression  is  simply  identified  with  the  Employees  relation  of  Figure  8.5. 
This  uses  the  first  simplification  rule  shown  in  Figure  9.3.  Other 
simplification  rules  to  be  used  are  also  in  Figure  9.3.  The  "completeness" 
of  these  rules  will  be  discussed  later. 



1.  Project(h,  (h,r))  = (h,r) 

2.  Case-jo1n(c,  c, 

Case-Join(c' , c,  ( h | , rj),  (h2,  r2)), 

(h2,  r2 ) ) 

= Case-Join(c ' , c,  (hp  rj),  (h2,  r2)) 

If  Project(h,  (h\  r j ) ) = (h,  r2) 
then  Case-Jo1n(c,  c,  (h'.rj),  (h,r2))  = (h'.rj) 

where  h,h',hj,h2  c Heading 
c,c'  ( Case-col 
r,rj,r2  < Relation 


Figure  9.3  - Some  Equivalence  Rules 
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We  have  been  using  the  term  expression  to  emphasize  that  base 
expressions  simply  identify  a portion  of  the  relation  in  the  database  state. 
They  do  not  identify  relations  which  are  visible  to  users  of  the  database 
system.  They  are  used  only  to  define  the  semantic  constraints  on  the 
database  state. 

However,  given  this  understanding,  we  can  now  refer  to  base 
relations.  A base  relation  specifies  the  result  of  evaluating  a base 
expression  in  a given  database  state.  As  in  the  case  of  base  expressions, 
base  relations  are  only  used  to  define  the  relational  constraints.  They  need 
not  actually  ever  "exist"  or  "be  produced"  in  the  implementation  of  a 
relational  system. 

II.  Given  the  definitions  of  the  base  expressions,  we  then  can  formally  define 
the  set  of  constraints.  Some  constraints  will  define  certain  properties  required 
of  the  base  relations  while  others  will  describe  how  each  database  relation  can 
be  expressed  as  some  composition  of  the  base  relations. 

Of  course,  since  the  base  relations  are  defined  as  projections  of  the  database 
relations,  the  end  result  is  that  all  of  the  constraints  in  fact  specify  properties 
of  the  database  relations  actually  seen  and  operated  on  by  the  user.  The  base 
relations  are  used  as  a method  of  structuring  the  constraints  and  will  be  used 
in  Chapter  10  to  define  equivalent  operations  between  the  semantic  graph  and 
relational  models. 

The  following  sections  define  constraints  on  the  base  relations,  but  as  Just 
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pointed  out,  when  specified  in  complete  form  they  define  properties  of  the 
database  relations.  Whenever  a base  relation  Is  specified,  the  complete 
expression  defining  the  base  relation  should  be  substituted  for  the  base  relation 
name. 

Each  base  expression  was  defined  in  step  I as  a consequence  of  one  of  the 
four  requirements  of  step  II  in  Algorithm  8.1.  So  we  will  refer  to  each  base 
relation  as  a "type  i base  relation"  where  1<K4  and  where  the  types  have 
associated  names  as  follows: 

type  1 - Existence  and  Identity 

type  2 - Non-identifying  Association  Characteristics 

type  3 - Non-identifying  Roles 

type  4 - Non-identifying  Entity  Characteristics 

A.  The  first  kind  of  constraint  required  makes  sure  that  whenever  the 
roles' or  characteristics  of  an  association  type  are  defined  by  several  base 
relations,  every  association  instance  of  that  type  in  the  database  state 
described  by  one  such  relation  is  also  described  by  every  other  relation 
pertaining  to  that  association  type. 

In  the  case  of  a predicate  of  the  form  be  tppe,  we  want  to  make 
sure  that  if  the  entity  has  non-key  characteristics,  then  every  association 
described  by  the  type  1 base  relation  for  this  association  type  is  also 
described  in  the  type  4 base  relations  for  this  association.  In  the 
Figure  9.8  example,  we  want  to  make  sure  that  every  entity  identified  in 
base  relations  Be-employee-base  and  Be-machine-base  have  characteristics 


defined  in  Employee-char-base  and  Dachine-char-base  respectively. 
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In  the  case  of  other  than  be  type  associations,  we  want  to  make 
sure  that  the  identifying  characteristics  which  occur  In  its  type  1 base 
relations  also  occur  in  any  type  2 or  type  3 base  relations  associated  with 
the  appropriate  association  type.  In  the  Figure  9.8  example,  this  means 
that  the  sets  of  values  which  occur  in  the  Operation-base  relation  also 
occur  in  the  Operation-case-base  relation  specifying  the  agent  role  for 
each  association  of  type  operation. 

The  formal  requirement  is  as  follows: 

For  each  type  1 Baserelation j with  heading^,  for  each  type  2,  3 or 
4 Baserelation i for  the  same  predicate,  there  is  a constraint 

Equal  [ Project!  heading j,  Baserelation j) , Baserelation ). 

The  Equal  constraint  type,  defined  in  Figure  A3. 2 simply  checks  that 
two  relations  are  of  equal  form  and  have  identical  sets  of  tuples. 

In  Figure  9.8,  the  corresponding  constraints  for  this  step  are 
Cl  - C3.  Note  that  in  this  case  the  constraints  are  superfluous  because  in 
each  of  them  they  assert  the  equality  of  identical  projections  of  the 
database  relations.  When  the  information  about  a single  association  is 
distributed  over  several  relations,  this  is  not  the  case  and  the  constraints 
become  necessary  to  preserve  the  semantic  Integrity  of  the  database  state. 
Such  is  the  case  in  Figure  9.9  for  constraint  C2  where  the 
Machine-char-base-1  and  Be-machine-base  base  relations  are  defined  in 
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terms  of  the  Machine-age  and  Operations  database  relations  respectively. 

B.  We  want  to  make  sure  that  each  entity  in  the  database  state  has 
associated  wlfh  it  only  one  value  of  each  non-key  characteristic.  For  each 
type  4 Baserelationi  with  heading t,  there  is  a constraint  of  the  form 

Col-func  [\(h.cf)l  [(h,cf)l  Baserelationi  ] 

where  h is  the  top  level  column  grouping  for  heading ^ as  defined  in  step 
III.B.4  of  Algorithm  8.1,  cj  maps  each  key  characteristic  for  the  entity 
type  into  the  appropriate  domain  name,  and  /'  maps  the  non-key 
characteristic  into  its  domain. 

Col-func  enforces  constraints  similar  to  the  functional  dependencies  of 
[Codd  72a],  The  Col-func  constraint  type,  defined  in  Figure  A3.3  requires 
that  for  a group  of  columns,  whenever  two  tuples  in  a relation  have 
equal  values  for  all  characteristics  in  these  columns,  the  values  for 
another  specified  set  of  columns  must  also  be  equal.  This  constraint  is 
used  in  maintaining  in  the  relational  state  the  notions  of  key 
characteristics  and  mandatory  cases. 

Constraints  C4  and  C5  in  Figure  9.8  are  defined  as  a result  of  this 

step. 
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C.  Similar  to  the  constraints  of  B.,  we  want  to  make  sure  that  every 
association  described  in  the  database  state  has  associated  with  it  only  one 
value  of  each  non-key  characteristic  and  only  one  entity  for  each 

unrestricted  role. 

1.  For  each  type  2 Baserelation } with  heading},  there  is  a constraint 
of  the  form 

Col-func  t heading’,  ( p,  cf)  , Baserelation } ] 
where  p is  the  predicate  described  in  Baserelation },  cf  maps  the 
non-key  characteristic  into  its  domain,  and  heading’  is  heading} 
exclusive  of  ( p.  c ). 

Since  none  of  the  associations  in  Figure  8.6  have  characteristics, 
there  are  no  constraints  in  Figure  9.8  created  by  this  step.  However, 
constraint  C9  in  Figure  9.9  is  such  a constraint. 

2.  For  each  type  3 Baserelation } with  heading },  there  is  a constraint 
of  the  form 


Col-func  [ headingl , heading2,  Baserelation } ] 

where  heading2  defines  all  characteristics  in  Baserelation } describing 
the  unrestricted  case  and  headingl  is  heading } exclusive  of  headingZ. 

Constraint  C6  in  Figure  9.8  is  the  appropriate  constraint  for  the 

unrestricted  operate: agent  case. 
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D.  For  each  entity  of  a given  type  in  the  semantic  graph  state,  there  must 

be  a corresponding  tuple  in  each  base  relation  describing  the  be  type 

predicate.  We  want  to  make  sure  that  the  key  characteristics  in  cases  of 

other  predicates  identify  existing  entities. 

1.  For  each  type  1 Baserelation ^ not  specifying  a be  type  predicate, 
there  is  a constraint  for  each  case  in  the  relation  of  the  form 

Subset  [ Project!  ( ( Hp,s)l,e  ),  c / ),  Baserelation ^ ),  Baserelation j ] 

where  p is  the  predicate  described  in  this  Baserelation s is  the  case 
of  concern,  e is  the  entity  type,  cj  is  the  function  specifying  the  key 
characteristics  of  the  entity  type  and  Baserelation  j is  the  type  1 base 
relation  defined  for  the  be  e predicate. 

The  Subset  constraint  type,  defined  in  Figure  A3.4  checks  that  for 
two  relations  defined  for  a single  predicate:case  pair,  the  characteristic 
values  in  the  first  relation  specified  are  a subset  of  the  characteristic 
values  in  the  second  relation  specified.  Subset  does  not  check  that 
tuples  in  the  first  relation  are  a subset  of  the  tuples  in  the  second 
relation.  Rather  it  looks  only  at  the  characteristics  and  their  values. 
This  is  because  it  is  desired  to  compare  characteristics  describing 
different  predicate=case  pairs. 

For  example,  constraint  C7  in  Figure  9.8  is  such  a constraint 
stating  that  every  machine  number  which  occurs  in  the  Operate-base 
relation  also  occurs  in  the  Be-machine-base  relation.  But  note  that 
this  compares  the  set  of  tuples  describing  the  operate:ob ject  pair  to 
the  set  of  tuples  for  be  machine! object. 
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2.  For  each  type  3 Baserelationit  there  is  a constraint  of  the  form 

Subset  [ Projectt  ( ( Kp.stl.e  ),  cj  ) , Baserelation j ),  Baserelation j ] 

where  s is  the  unrestricted  case  defined  in  this  relation  and  p,  e.  cj 
and  Baserelation  j are  as  in  1 above. 

This  step  results  in  constraint  C8  of  Figure  9.8  specifying  that 

every  employee  name  Operate-case-base  also  occurs  in  the 

Be-employee-base  relation. 

E.  In  the  semantic  graph  schema,  it  may  happen  that  a particular  role  in 
some  association  type  is  specified  as  mandatory  for  some  entity  type. 
When  this  happens,  we  must  make  sure  that  the  relations  reflect  this 
feature  of  the  application. 

For  each  entity  type  e i E*.  for  each  association  a c A’,  and  role  r, 
such  that  fr*  (cr,r)  .mand  - mandatory  and  R’(o,r)-e,  define  a constraint 

Equate  [ Project  ( heading,  Baserelation ^ ),  Baserelation  j ] 

where  Baserelation j is  the  type  1 relation  if  r is  restricted  or  else  the 
type  3 relation  if  r is  unrestricted,  heading  is  the  column  group  in 
Baserelation ^ concerned  with  the  predicate:case  pair  Tp(o):Ts(r),  and 
Baserelation  j is  the  type  1 base  relation  for  the  be  e predicate. 

The  Equate  constraint  type,  defined  in  Figure  A3. 5 checks  that  for 
two  relations  defined  for  a single  predicate:case  pair,  the  characteristic 
values  in  thp  first  relation  specified  are  equal  to  the  characteristic  values 


•**-  Mnni|  relation  specified. 
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The  Equate  constraint  type,  like  Subset,  compares  only  the 
characteristic  values.  As  in  Subset  this  is  for  the  purpose  of  comparing 
columns  describing  different  predicatexase  pairs.  This  is  in  contrast  to 
the  Equal  constraint  type  which  requires  both  values  and  headings  to  be 
identical.  An  instance  of  an  Equate  constraint  could  be  expressed  with 
two  instances  of  the  Subset  constraint.  Equate  is  retained  for  clarity. 

Constraint  C9  in  Figure  9.8  states  that  every  machine  must  be  the 
object  of  an  operate  predicate  in  the  Opera te-base  relation. 

F.  For  each  Baserelation j,  we  define  the  constraint 

Non-null  [ Baserelationi  ]. 

The  Non-nut)  constraint  defined  in  Figure  A3.6  requires  that  the  null 
value  cannot  occur  anywhere  in  a specified  relation. 

This  simply  states  that  for  every  entity  in  the  database  state,  each 
characteristic  for  its  entity  type  is  defined  and  for  each  association,  every 
role  and  characteristic  for  its  association  type  is  defined.  This  results  In 
constraints  C10-C15  of  Figure  9.8. 

This  ends  the  specification  of  constraints  on  the  base  relations. 
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111.  So  far  we  have  defined  the  base  relations  and  their  constraints.  Now 
we  must  complete  the  definitions  of  the  constraints  by  specifying  how 
each  database  relation  is  composed  out  of  the  base  relations.  Since  each 
base  relation  is  the  projection  of  some  database  relation,  this  has  the 
effect  of  specifying  conditions  which  must  hold  within  a single 
database  relation  or  between  several  database  relations. 

If  each  database  relation  were  identical  to  a single  base  relation,  this 
last  set  of  constraints  would  not  be  necessary.  These  constrains  specify 
how  the  various  "basic"  facts  are  combined  to  form  any  view  of  the 
application  a user  wishes  to  see  in  a single  relation.  These  constraints  are 
somewhat  similar  to  the  definition  of  virtual  relations  in  syntactic 
relational  database  systems  (fCodd  74]  and  [Chamberlin  et.  al.]).  However 
the  concept  of  virtual  relations  is  closely  tied  to  the  implementation 
Issues  in  a database  system.  Our  use  of  base  relations  is  purely  for 
definitional  reasons. 

There  will  be  one  such  constraint  for  each  relation  in  the  database. 
This  constraint  expresses  the  equality  of  the  relation  with  a relation 
created  using  the  Conjunction,  Predicate-join,  Case-join  and  Project  operations 
on  the  base  relations.  The  Conjunction  operation  is  used  to  combine 
relations  with  statements  concerning  different  predicates  into  a single 
relation.  Predicate-join  combines  relations  describing  different  cases  or 
characteristics  of  a single  predicate  into  a single  relation.  Case-join 


combines  relations  which  specify  different  characteristics  of  an  entity 
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type  into  a single  relation.  Project  eliminates  from  a relation  those 
columns  or  predicates  which  are  not  of  interest. 

[Codd  70]  defines  the  join  operation  to  combine  two  syntactic 
relations  into  a single  relation.  The  use  of  this  operation  is  criticized  by 
[Chen]  as  it  may  allow  the  creation  of  a relation  which  has  no  semantic 
interpretation  in  the  application  being  described  in  the  database.  We  solve 
this  problem  by  defining  three  different  operations  to  form  a semantic 
relation  from  two  other  semantic  relations. 

The  procedure  for  defining  the  constraints  is  to  do  the  following  for 
each  flelationi  with  headingi  defined  in  r-schema. 
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A.  Build  the  expression  for  each  relation. 


1.  Include  the  desired  characteristics  of  each  entity  in  the  relation. 
For  each  predicate  j-.case^  pair  occurring  in  headingi  which  has 
non-key  characteristics  specified  by  heading define  the  following 
expression 


• Case-join  (({(be  e,  object)},  e),({(predicate  j.case^y.e), 

( be-e- head  Be-e-cha r- base n), 

Case  jom(({(be  e, object )}, e), ({(predicate  j.case^y.e), 
( be-e- head n_ j,Be-e-cha r- base n_j). 


Case-join  (({(be  e,object)}te),  ({(predicate  j.cose^y.e), 
(be- e-head p.Be-e-char- base  p). 


C.ase-join(({(be  e,object)},e),( {(predicate j,coseA)),e)f 
( be-e- head  j, Be-e-cha  r- base  t), 
(pred-head,rredicate  j- base)))). ..)))) 

where  e is  the  entity  type  associated  with  predicate  pcase^  by 
heading j.  Be-e-char-base j to  Be-e-char-base„  and  be-e-head  j to 
he-e-hcodn  are  the  type  4 base  relations  and  headings  for  the  n 
non-key  characteristics  of  entity  type  e mentioned  in  the 

predicate j-xase^  columns  of  heading f.  Predicate  j-base  and  pred-head 
are  the  base  relation  and  heading  defined  for  predicate  j.  This  base 
relation  is  of  type  1 if  case^  is  restricted  and  of  type  3 if  cas  e*  is 
unrestr icted. 

This  defines  a relation  consisting  of  all  of  the  key  characteristics 
of  the  entity  type  along  with  those  non-key  characteristics  occurring 
in  the  relation  for  the  specified  predlcateicase  pair. 

The  Case-join  function  is  defined  in  Figure  A3. 7.  This  function 
combines  the  characteristics  of  an  entity  type  occurring  in  two 

different  relations  into  a new  relation.  Tuples  are  matched  according 
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to  the  equality  of  the  common  characteristics  defined  in  the  relations 
for  the  entity  type  and  predicate:case  of  concern. 

For  example,  using  the  Employee  and  Operate  relations  of 
Figure  8.5, 

Ca  se-join  ( ( { ( be  employee.ob ject )}  .employee), 

( {operate.agent},employee). 

Employee, 

Operate) 

results  in  the  Operations  relation  of  Figure  9.4.  The  information 
concerning  the  employees’  ages  from  the  Employee  relation  was  added 
to  the  Operate  relation. 


Operations 


operate :agent 

be  machine -.object 
operate:object 

employee 

machine 

name 

age 

number 

type 

names 

years 

serial -numbers 

machine-types 

T .Manhart 

32 

NZ745 

lathe 

C .Gershag 

40 

JCL181 

press 

Figure  9.4  - Result  of  Case-join  Operation 


If,  for  a tuple  in  either  of  the  relations  being  Joined,  there  is  no 
tuple  in  the  other  relation  with  equal  values  for  the  common 
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characteristics,  then  the  information  from  the  unmatched  tuple  will 
not  appear  in  the  resulting  relation.  To  assure  that  this  operation 
does  not  result  in  false  statements  concerning  the  application,  this 
algorithm  will  only  use  the  operation  when  all  key  characteristics  of 
the  entity  type  of  concern  are  present  in  both  relations. 


2.  Include  the  desired  characteristics  and  roles  of  associations  in  the 
relation.  For  each  predicate j occurring  in  headingi  which  has  two 
or  more  unrestricted  cases  or  non-key  characteristics  in  heading 
define  the  following  expression 

Predicate-join  (predicate  j,(pred-headn,  Predicate  j-char-basen), 

Predicate-jom(  predicate j.(pred-headn_j,  Predicate  j-char-base„.j). 


Predicate-join(predicatey,(pred-Jiead3,  Predicate j-char-baseg). 

Predicate  join( predicate y,(pred-/iead2,  Predicate j-char-base2 ), 

( pred-headj , Predicate j-char-base  j)))....)) 

where  prcd-headj  to  pred-headn  and  Predicate j-char-base t to 
Predicate j-char-basen  are  the  type  2 headings  and  relations  for  each 
non-key  characteristic  and  the  type  3 headings  and  relations  defined 
for  each  unrestricted  case. 

If  predicate j has  no  unrestricted  cases  or  non-key  characteristics 
in  heading then  the  expression  defined  by  this  step  is  the  type  1 
base  relation  for  predicate  j. 

If  predicate  j has  one  non-key  characteristic  or  unrestricted,  then 
the  result  of  this  step  is  the  type  3 or  type  2 base  relation  for  the 
predicate. 

This  defines  a relation  which  contains  each  desired  unrestricted 
case  or  non-key  characteristic  for  the  predicate  along  with  the 


identifying  characteristics  (restricted  case  and  key  characteristics)  for 
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1 


the  corresponding  association  type. 

Predicate-join,  defined  in  Figure  A3.8  matches  information 
concerning  the  desired  predicate  according  to  those  cases  and 
characteristics  of  this  predicate  which  are  common  to  both  relations. 
For  the  Operate-case-base  and  Operate-char-base  relations  defined  in 
Figure  9.5,  the  Operate- join  relation  in  Figure  9.5  is  the  result  of 

Predicate-join(operate.Operate-case-base,Operate-char-base). 

Note  that  both  the  agent  case  and  the  cost -per -hour  characteristic  of 
the  operate  predicate  appear  in  Operate- join. 

If,  for  a tuple  in  either  of  the  relations  being  joined,  there  is  no 
tuple  in  the  other  relation  with  equal  values  for  the  common 
predicate  characteristics  and  cases,  then  the  information  from  the 
unmatched  tuple  will  not  appear  in  the  resulting  relation.  To  assure 
that  this  operation  does  not  result  in  false  statements  concerning  the 
application,  this  algorithm  will  only  use  the  operation  when  all  keg 
characteristics  of  the  predicate  type  of  concern  and  when  all  keg 
characteristics  of  the  restricted  roles  are  present  in  both  relations. 
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Operate-case-base 


operate:agent 

operate :object 

operate 

employee 

machine 

name 

number 

type 

shift 

names 

serial -numbers 

machine-types 

shifts 

T .Manhart 

m 

lathe 

1 

B.Belgim 

Kim 

lathe 

2 

C . Gershag 

JCL181 

press 

1 

Operate- join 


operate : agent 

operate:object 

operate 

machine 

number 

type 

shift 

cost-per-hour 

names 

serial-numbers 

machine-types 

shifts 

dol lars 

T .Manhart 

E'{9 

lathe 

1 

40 

B.Belgim 

mSSM 

lathe 

2 

SO 

C .Gershag 

JCL181 

press 

1 

75 

Figure  9.5  - Predicate- join  Example 
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3.  Include  the  combination  of  information  concerning  a single 
predicate.  For  each  predicate j occurring  in  heading ^ which  had  an 
expression  defined  in  step  i.  above,  define  the  following  expression 

Case  join(({( predicate  j,case„)),e), {{{predicate  j,caseH)),e), 
e*P„. 

Case-join(({(  predicate  y,caseB.j)},e),({(predtcate  j,caseB_  j)},e). 


Case-join(({(predicate  j,case2)),e),({(.predicate  j,cose2)},e), 
e*P 2’ 


Case-join(({(predicatey,casej)),e),({(pre</iccte  j.case  j)},e), 
expj,  expff))...)) 


where  exp0  is  the  relational  expression  defined  in  step  2 for 
predicate j and  exp ^ is  the  expression  defined  in  step  1 for  case^. 

If  no  expressions  were  defined  for  predicate  j in  step  1,  then  the 
result  of  this  step  us  exp0. 

This  adds  to  each  column  group  the  desired  non-key  characteristic 


for  the  entity  type  of  concern. 
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4.  Combine  the  different  predicates  into  a single  relation.  If  heading ^ 
contains  more  than  one  predicate,  then  define  the  following  expression 

Conjunction(cof-f  is  t-l„,col-l  is  t-2„, 
e*p„, 

Conjunctionfco/ -l  is  t-ln_j,col-l  is  t-2„_j, 
e*Pn-l • 


Conjunction(coI-/ist-f2.C0^  ts  t-22» 
exp2, 

Conjunction(col -list-1 j, col- 1 ist-2Jr 
expj,  exp0))...)) 

where  exp  j is  the  relational  expression  defined  in  step  3 for  each 
predicate  ^ Each  col-list-1  j is  a sequence  of  column  headings 
specifying  those  cases  and  their  entity  types  of  predicate  j which 
occur  in  heading ^ with  other  predicate:case  pairs.  Each  col-list-2  j 
lists  column  headings  consisting  of  the  entity  type  and  predicate:case 
pairs  defining  column  groups  in  exp  j.  Corresponding  columns 
specified  by  the  two  lists  will  be  combined. 

If  there  is  only  one  predicate  in  heading then  the  result  of  this 
step  is  the  expression  defined  in  step  3. 

This  defines  a single  relation  which  contains  information 
concerning  all  of  the  predicates  which  are  to  be  in  the  corresponding 
database  relation. 

Conjunction  is  defined  in  Figure  A3. 9.  The  arguments  to  the 
function  specify  two  relations  and  two  lists  of  column  groups. 
Corresponding  entries  in  the  two  lists  must  define  predicate:case  pairs 
defined  on  the  same  entity  type  and  the  same  characteristics.  It  is 
also  required  that  no  single  predicate  appears  in  both  relations.  If  this 


were  the  case,  semantic  ambiguity  might  result.  Unlike  the  situation 
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with  Case  join  and  Predicate-join,  the  result  of  a Conjunction  operation 
contains  all  information  present  in  both  arguments  to  the  operation. 

The  heading  produced  by  the  function  is  formed  by  combining 
the  specified  column  groups  and  retaining  all  of  the  other  columns  in 
both  relations.  The  tuples  in  the  resulting  relation  are  formed  by 
combining  tuples  from  the  two  relations  which  have  identical  values 
in  the  merged  columns  and  by  adding  null  values  to  all  other  tuples. 

For  example, 

Conjunction({({(supervise,object)},employee)}, 

{({(operate.agent)}, employee)}. 

Supervise-base, 

Operators) 

where  relations  Supervise-base  and  Operators  are  defined  in  Figure  9.6 
results  in  the  Jobs  relation  of  Figure  9.6.  Note  that  the  null  value 
was  introduced  since  B.Belgim  has  no  supervisor. 

B.  Finally  define  the  constraint 

Equal[  Relation j,  Project(/leo«/tnp^ , exp^)] 

where  exp0  is  the  relation  expression  defined  in  step  4. 

The  Project  operation  in  this  constraint  eliminates  any  of  those 
columns  used  in  the  Case-join  and  Predicate-join  operations  which  do  not 
appear  in  the  final  headingi.  Such  columns  would  be  Key  characteristics 
of  restricted  cases  or  key  characteristics  of  predicates. 

For  the  relations  of  Figure  8.5,  these  steps  result  in  the  constraints 


/ 


superv1se:agent 

supervlserobject 

employee 

employee 

name 

name 

names 

names 

G .Wayshum 

T.Manhart 

G . Wayshum 

C .Gershag 

Operators 


operate  :agent 

operate:object 

employee 

machine 

name 

number 

type 

names 

serial-numbers 

machine-types 

B .Belgim 

wsa 

lathe 

T.Manhart 

lathe 

C .Gershag 

JCL181 

press 

Jobs 


superv1se:agent 

supervise:object 

operate:agent 

operate :object 

employee 

employee 

machine 

name 

name 

number 

type 

names 

names 

serial-numbers 

machine-types 

— 

B.  Belgim 

lathe 

G. Wayshum 

T .Manhart 

lathe 

G .Wayshum 

C .Gershag 

JCL181 

press 

Figure  9.6  - Example  of  Conjunction  Operation 
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Cl 6’  and  Cl 7'  of  Figure  9.10  at  the  end  of  the  chapter.  The  constraints 
C 1 6 and  C17  of  Figure  9.8  are  derived  from  C16'  and  Cl 7*  by  use  of  the 
simple  equivalence  rules  shown  in  Figure  9.3. 

This  completes  the  formal  description  of  the  constraint  definition 
procedure.  The  use  of  the  stated  procedure  and  the  three  simplifying  rules 
result  in  the  constraints  shown  in  Figure  9.9  for  the  relational  schema  shown 
in  Figure  8.2.  These  constraints  include  the  various  types  of  constraint 
operations  not  shown  in  Figure  9.8.  The  constraints  in  these  figures  should  be 
considered  in  detail  to  get  a feeling  of  how  they  reflect  the  features  of  the 
graph  schema. 

End  of  Algorithm  9.1. 

The  restricted  semantic  relation  data  model  is  then  formally  defined. 

Definition  9^2:  The  restricted  semantic  relation  data  model  consists  of  the  set  of 
all  semantic  relation  application  models  with  schemas  derivable  by  Algorithm 
9.1. 


I 
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9.4  Discussion 

I 

While  the  constraints  defined  by  Algorithm  9.1  are  specified  as 
compositions  of  syntactic  expressions  in  a canonical  form,  it  is  important  to 
remember  that  it  is  the  function  represented  by  the  expressions  which  define 
the  constraints  in  the  schema.  Consequently  there  could  be  many  other  ways  of 
expressing  each  constraint  or  combination  of  constraints.  The  rules  In 
Figure  9.3  are  only  a few  simple  equivalences.  There  are  many  other  such 
equivalences.  [Sagiv  and  Yannakakis]  show  that  the  determination  of 
equivalence  for  syntactic  relational  algebra  expressions  involving  only  projection 
and  join  is  NP-complete.  The  problem  is  certainly  no  simpler  in  the  case  of  the 
semantic  relation  operations. 

We  will  now  again  describe  some  of  the  ways  in  which  relational 
application  models  can  be  defined  which  have  no  equivalent  graph  application 
model.  It  is  possible  that  either  the  constraints  are  not  strict  enough,  allowing 
relational  states  with  no  equivalent  valid  graph  state,  or  the  constraints  may  be 
too  strict,  which  would  result  in  excluding  from  the  set  of  valid  relational 
states  that  state  which  is  equivalent  to  some  valid  graph  state. 

If  Constraint  C3  was  eliminated  from  Figure  9.9,  then  the  constraints  are 
not  strict  enough  since  this  would  allow  statements  specifying  a machine's  age 
without  specifying  its  location.  The  graph  model,  however,  requires  that  all 
characteristics  be  defined. 


in  Chapter  8 an  example  was  presented  of  over-constraining  the  database 
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state.  Using  the  formal  notation  which  has  now  been  Introduced,  we  express 
the  constraint  from  this  example  as  Figure  9.7.  When  applied  to  the  relations 
of  Figure  8.2.  this  constraint  says  that  the  types  of  every  machine  operated  by 
the  employees  of  a given  supervisor  must  be  equal.  As  discussed  in  Chapter  8, 
such  a constraint  is  not  expressible  in  the  semantic  graph  data  model.  In  effect, 
this  constraint  specifies  properties  which  must  be  satisfied  by  nodes  in  the 
semantic  graph  state  which  are  not  connected  by  an  edge.  The  semantic  graph 
operations  do  not  enforce  such  constraints. 


Figure  9.7  - Constraint  Causing  Non-equivalence 
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We  conclude  this  chapter  with  a summary  of  the  main  steps  in  the 
derivation  of  restricted  semantic  relation  schemas.  Chapter  10  then  looks  at  the 
problem  of  the  equivalence  of  the  semantic  graph  and  semantic  relation  data 
models. 


Algorithm  8.1: 

I.  Define  an  augmented  graph  schema. 

II.  Select  subgraphs  of  the  augmented  schema  defining  that  portion  of  the 
application  which  will  be  described  in  each  relation,  making  sure  that  four 
types  of  requirements  are  satisfied. 

III.  Define  the  relational  headings  and  other  schema  components  except  for 
the  constraints. 

Algorithm  9.1: 

I.  Define  base  expressions  as  projections  of  the  database  relations  so  that 
there  is  a base  expression 

-for  each  entity  type 
-for  each  association  type 

-for  each  non-key  characteristic  of  an  association  type 
-for  each  unrestricted  role  of  an  association  type 
-for  each  non-key  characteristic  of  an  entity  type. 

II.  Define  constraints  on  the  base  relations  so  that 

-all  appropriate  characteristics  are  defined  for  each  entity  and  each 
association 

-all  appropriate  roles  are  defined  for  each  association 
-the  key,  restricted  and  mandatory  features  are  enforced. 

III.  Define  constraints  showing  how  each  database  relation  is  related  to  the 
set  of  base  relations. 
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Base  Expressions: 


8e-machine-base  = Project( 


be  machine: object 


machine 


number 


serial -numbers 


, Operate) 


8e-employee-base  = Project( 


be  employee :object 


employee 


name 


names 


, Employees) 


Operate-base  = Project( 


operate :object 


machine 


number 


serial-numbers 


, Operate) 


Operate-case-base  = Project( 


operate .’agent 

operate: object 

employee 

machine 

name 

number 

names 

serial -numbers 

, Operate) 


Employoe-char-base  = Employees 


Figure  9.8  - Constraints  for  Figure  8.5 
(Page  1 of  4) 
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C4.  Col-func[ 


be  employee: object 

be  employee  .-object 

employee 

• 

employee 

name 

age 

names 

years 

, Employee-char-base  ] 


C5.  Col-func[ 


be  machine: object 


machine 


number 


serial-numbers 


be  mach1ne:object 


machine 


type 


machine-types 


, Hachlne-char-base] 


C6.  Col-func[ 


oporate:object 

operate :agent 

machine 

» 

employee 

number 

name 

ser  ial -numbers 

names 

, Operate-case-base] 


C7.  5ubset[ Project! 


operate -.object 


machine 


number 


serial-numbers 


, Operate-base),  Be-machtne-base] 


Figure  9.8  - Constraints  for  Figure  8.5 
(Page  3 of  4) 
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C8.  Subsetf Project! 


operate :agent 


employee 


name 


names 


,Operate-case-base ) , Be-employee-base  ] 


C9 . Equalc[ Project! 


operate :object 


machine 


number 


serial -numbers 


, Operate-base),  Be-machine-base] 


CIO.  Non-nu 1 1( Be-machine-base] 
Cll.  Non-nul 1[  Be-employee-base] 
C12.  Non-nul l[Opcratc-base] 

C13.  Non-nul l[Operate-case-base] 
CM . Non-nul  1[fmployee-char-base] 
C15.  Non-nul 1[Machine-char-base] 


C16.  EqualfOperate, 

Conjunction! 


be  machine :object 

operate :object 

machine 

# 

machine 

Machine-char-base, 
Case- join! 


be  machine robject 

operate :object 

machine 

9 

machine 

Machine-char-base, 
Operate-case-base ) ) ] 


C 1 7 . Equal[Employees,  Employee-char-base] 


Figure  9.8  - Constraints  for  Figure  8.5 
(Page  4 of  4) 
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Base  Expressions: 


Be-mnchine-base  = Project( 


be  mach1ne:object 


machine 

number 

type 

serial-numbers 

machine-types 

, Operations) 


Be -employee -base  = Project( 


be  employee:object 


employee 


name 


names 


, Supervlse-operate) 


Operate-base  = Project( 


operate:object 

operate 

machine 

number 

type 

shift 

serial-numbers 

machine-types 

shifts 

Operations) 


Supervise-base  = Project( 


supervise:agent 

supervise :object 

employee 

employee 

name 

name 

names 

names 

Supervlse-operate) 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  1 of  9) 
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Employee-char-base  = Project( 


be  employee :object 

employee 

name 

age 

names 

years 

, Supervlse-operate/ 


Mach ine-char-base- 1 = Machine-age 


Machine-char-base-2  = Project^ 


be 

machineiobject 

machine 

number 

type 

location 

serial -numbers 

machine-types 

locations 

Operations) 


Operate-char-base  = 


Project( 


operate  robject 

operate 

machine 

number 

type 

shift 

cost-per-hour 

serial -numbers 

machine-types 

shifts 

dol lars 

Operations) 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  2 of  9) 
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Oporate-case-base  = 


Projcct( 


operate :object 

operate :agent 

operate 

machine 

employee 

number 

type 

name 

shift 

serial-numbers 

machine-types 

names 

shifts 

Operations) 


Constraints 


Cl.  fqua l[ Project( 


be  employee -.object 


employee 


name 


names 


, Employee-char -base),  Be-employee-base] 


C Z.  Fqual[Project( 


be  machine :object 


machine 


number 

type 

serial-numbers 

machine-types 

,Mach1ne-char-base-l ), 


Be-machine-base] 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  3 of  9) 
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be  machine:object 


C3.  EqualfProject( 


machine 


,Machine-char-base-2 ) , 


number 


serial-numbers  machine-types 


Be-machine-base] 


C4 . Equa 1[ Project( 


operate :object 


machine 


operate 


number 


shift 


serial-numbers  machine-types  shifts 


Operate-char-base) , 
Operate-base] 


C5.  Equa l[Project( 


operate :object 


machine 


number 


operate 


shift 


serial-numbers  machine-types  shifts 


Operate-case-base) , 
Operate-base] 


be  employee :object  be  employee :object 


C6.  Col-func[  employee 


employee 


,Employee-char>base] 


years 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  4 of  9) 
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Machine-char-base-1 ] 


Machlne-char-base-2 ] 


Operate-char-base] 


Operate-case-base] 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  6 of  9) 


•a-- 
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€11.  Subset) Project) 


be 

machine:object 

machine 

number 

type 

serial-numbers 

machine-types 

, Operate-base) , 


Be-machine  base] 


Cl?.  Subset! Projects 


operate :agent 


employee 


name 


names 


, Opera  te-case-base),  Be-employee-base] 


C13.  Subset.)  Projcct( 


supervise:agent 


employee 


name 


names 


, Supervise-base),  Be-employee-base] 


C14.  SubsetL Project) 


supervise -.object 


employee 


name 


names 


, Supervise-base),  Be-employee-base] 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  6 of  9) 


Be-mach ine-base] 


CIO.  Nnn-nu  1 1|  Be-mach ine-base ] 

Cl/.  Non-nu 1 1[ Be -employee -base] 
rin.  Non-nil  1 Operate-base] 

CIO.  Non-nu 1 l[Suporvi sc-base] 

C20 . Non-nul 1[ 1 mployee-char-base] 

C2 1 . Non-nul  1[ Mach  ine-char -base- 1 ] 

C ?Z  . Non-nu  1 l[Mach  ine-char -base-? ] 

C?3 . Non-nu 1 l[Operate-char-base] 

C 24  Non-nu 1 1[ Opera te-case-base ] 

C?!>.  EquallMachine-age,  Machine-char-base- 1 ] 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  7 of  9) 
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f 


C26.  Equal[l  oc.it Ions-used, 
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i-used, 

operate:object 

nnomt  a 

machine 

location 

shift 

locations 

shifts 

Case-jo1n( 


be  mach1ne:object  operate:object 


machine 

i 

Machlne-char-base-2, 

Operate-base))] 


C 27.  Equal[Operat Ions, 

be  machine:obj 

Conjunction(  

machine 

Machlne-char-base-2, 


machine 


be  machinerobject 

operate :object 

machine 

» 

machine 

Mach ine-char -base-2, 
Operate-base), 

Predicate- jo 1n( operate, 

Operate-case-base, 
Operate-char-base) ) ) ] 


Figure  9.9  - Constraints  for  Figure  6.2 
(Page  8 of  9) 
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C?8.  EgualfSupervIse-operate, 


ProJoct( 


supervise -.agent 

be  employee:object 
supervise :object 
operate : agent 

operate :object 

employee 

employee 

machine 

name 

name 

age 

number 

type 

names 

names 

years 

serial -numbers 

machine-types 

be  employee  .‘object 

Conjunctlon(  

employee 

Employee-char-base, 


superv1se:object 
operate: agent 

employee 


Employee-char-base, 

Supervise-base), 


Case- jo1n( 


be  employee: object 
employee 


operate: agent 
employee 


Employee -char -base. 
Operate -case -base ) ) ) ) ] 


Figure  9.9  - Constraints  for  Figure  8.2 
(Page  9 of  9) 
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C 1 6 * . Equa 1[ Operate , 


Projcct( \ 


operate: agent 

operate:object 

j 

employee 

machine 

name 

number 

type 

names 

serial -numbers 

machine-types 

Conjunct1on( 


be  machine:object 

operate  .-object 

machine 

» 

machine 

Case- Join( 


be  machine:object 

be  machinerobject 

machine 

* 

machine 

Case-jo1n( 


be  machine:object 

be  machinerobject 

machine 

» 

machine 

Mach  ine-char -base, 
Be-machlne-base), 
Be-machlne-base), 


Case-Jo1n( 


operate :object 

operate  robject 

machine 

f 

machine 

Case-Join( 


be  machine:object 

operate: object 

machine 

9 

machine 

Mach  ine-char -base, 
Operate-base), 
Operate-case-base ) ) ) ] 


Figure  9.10  - Unreduced  Constraints  of  Figure  9.8 
(Page  1 of  Z) 
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C 1 7 ' . F qua  If  F mp  loyees. 


be  employee robject 

Project( 

employee 

name 

age 

names 

years 

Fmployee- char -base. 
Be -employee- base), 
Be-employee-base ) ) ] 


Figure  9.10  - Unreduced  Constraints  of  Figure  9.8 
(Page  Z of  2) 
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Chapter  10  - Constructive  Proof  of  Data  Model  Equivalence 

This  chapter  shows  by  the  construction  of  equivalent  operations  that  the 
restricted  semantic  relation  and  the  semantic  graph  data  models  are  state 
dependent  equivalent  (as  defined  in  Chapter  7).  The  approach  taken  is  to  show 
the  equivalence  of  any  pair  of  application  models  consisting  of  a semantic  graph 
application  model  and  a semantic  relation  application  model  whose  schema  was 

derived  from  the  graph  schema  by  Algorithms  8.1  and  9.1. 

I 

This  can  be  done  by  induction  for  each  such  pair  of  application  models  in 
the  following  manner: 

I.  Show  that  the  Initial  states  are  equivalent. 

II.  Show  that  for  any  given  pair  of  valid  (e.g.  reachable)  equivalent  graph 
and  relational  states,  and  for  any  given  valid  semantic  graph  operation,  an 
equivalent  relational  operation  can  be  constructed.  (That  is,  the  states 
resulting  from  the  operations  applied  to  the  given  equivalent  states  are 
also  equivalent.) 

III.  Show  that  for  any  given  pair  of  valid  (e.g.  reachable)  equivalent 
graph  and  relational  states,  and  for  any  given  valid  semantic  relation 
operation,  an  equivalent  (possibly  composed)  graph  operation  can  be 
constructed. 

If  this  can  be  shown  for  every  such  pair  of  application  models,  then  it  proves 
that  the  restricted  semantic  relation  and  the  semantic  graph  data  models  are 
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state-dependent  equivalent.  Note  that  the  examples  of  Chapter  7 show  why 
state-dependent  equivalence,  as  opposed  to  one  of  the  other  forms  of 
equivalence,  must  be  considered. 

While  a proof  to  a detailed  level  would  be  desirable  to  show  the 

correctness  of  an  actual  implementation  based  on  these  constructions  of 

equivalent  operations,  such  a proof  would  be  extremely  complicated  and  lengthy. 
It  would  involve  proofs  concerning  every  clause  of  the  definitions  in  Appendices 
1,  2 and  3,  and  every  step  of  Algorithms  8.1  and  9.1.  Such  a proof  would  be 

best  done  with  some  sort  of  mechanized  aid.  However,  we  still  hope  the  more 

informal  arguments  justifying  the  claims  of  this  chapter  are  convincing. 

It  is  important  to  realize  that  this  approach  is  not  meant  to  imply  the 
unimportance  of  detailed  formal  proofs.  On  the  contrary,  it  points  out  the  great 
Importance  which  should  be  attached  to  finding  workable  methodologies  for 
proving  program  correctness. 


10.1  State  Equivalence  for  the  Restricted  Semantic  Belation  Data  Model 

At  this  point,  we  return  to  the  problem  of  defining  the  equivalence  of 
graph  and  relational  database  states.  As  mentioned  in  Chapter  7,  one  approach 
would  be  to  show  that  the  graph  state  is  a model,  in  the  formal  logic  sense,  for 
the  set  of  predicate  calculus  statements  associated  with  the  relational  state. 
This  approach  would  require  a formal  specification  of  the  set  of  predicate 
calculus  statements  associated  with  a given  relation.  This  approach  would  be 


- 215  - 


appropriate  for  any  semantic  relation  state,  not  Just  those  in  the  restricted 

semantic  relation  data  model. 

However,  we  will  take  an  approach  which  relies  on  the  canonical  form  of 
the  constraints  in  the  semantic  relation  data  model.  The  advantage  of  this 
approach  is  that  It  is  more  closely  tied  to  the  process  of  deriving  equivalent 
operations  for  two  given  application  models. 

Equivalence  will  be  determined  by  comparing  the  base  relations  (as  defined 
in  Algorithm  9.1)  of  a relational  state  with  the  graph  state.  This  approach  is 

related  to  the  work  of  [Hall  et.  al.]  which  advocates  the  design  of  relational 

databases  in  terms  of  "irreducible"  or  "basic"  relations  which  represent 

non-redundant  "basic  facts"  concerning  the  application.  The  base  relations  of 
this  thesis  also  represent  such  "basic  facts".  We  will  use  Definition  7.1  to 
specify  the  criteria  for  state  equivalence  against  which  we  will  measure  the 


new  definition. 
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Theorem  10.1:  Given  a valid  (by  Definition  6.2)  restricted  semantic  relation 
database  state  r-state  in  a restricted  semantic  relation  database 
((r-schema,  r-state0,  r-op-types),  r-state)  and  a valid  (by  Definition  5.2) 
semantic  graph  database  state  g-state  - (E,AJJs,c)  in  a semantic  graph  database 
((g-schema,  g-state0,  g-op-types),  g-state)  where  r-schema  is  derived  from 
g-schema  by  Algorithms  8.1  and  9.1,  then  r-state  is  state  equivalent  to  g-state 
if  and  only  if  both  g-state  and  r-state  equal  error  (signifying  that  neither 
represents  a possible  application  state)  or  when  considering  the  projections  of 
the  relations  in  r-state  which  are  defined  as  base  relations  in  the  schema 
derivation,  the  following  are  true: 

1.  Existence  and  Identity 

a.  Entities.  For  each  type  1 base  relation  for  a predicate  of  the  form 
be  e,  the  set  of  tuples  in  the  base  relation  equals 

(tuple:  Tuple  | (3e-td:E-id)  (e-idcf  a e-icf.type-e 
a dorn( tuple)  - (( ((be  e, object)) ,e)l 
a tuple!  (I  (be  e,  object)  I , e) ) - e-id.chars) ) 

This  means  that  for  every  entity  in  the  graph  state  there  is  a tuple 
in  the  corresponding  type  1 base  relation  consisting  of  the  identifying  key 
characteristics  of  the  entity.  Such  a tuple  specifies  the  semantic  graph 


characteristic  edges  for  these  characteristics. 
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b.  Associations.  For  each  type  1 base  relation  for  a predicate  p other  than 
be  e,  the  set  of  tuples  in  the  base  relation  equals 

(tuple:  Tup  I e | (3a-td:  A-  id)  ( a-id(A  a a-id.type  ■ Tp"^(p) 

a tupledp-*  (p) ) - a-id. chars 
a dom  (o-id.roles)  - 

(r:R  | ( ( (Tp-1  (p) , Ts(r)  )l  , o-id.roles  (r)  .type)  c dom  (tuple)  I 
a (Vr:R)  (r< dom  (o-id.roles) 

3 tuple!  (I  (Tp~l  (p) , Ts  ( r) ) I , o-id.roles  (r). type) ) - 

a-t d.roles  ( r)  .chars) ) I 

Tp  and  Ts  are  the  translation  functions  defined  in  Algorithm  8.1. 

This  means  that  for  each  association  in  the  graph  state  there  is  a 
tuple  in  the  corresponding  type  1 base  relation.  This  tuple  consists  of  the 
Key  characteristics  of  the  association  and  the  Key  characteristics  of  the 
entities  in  the  restricted  roles  of  the  association.  It  therefore  specifies 
those  characteristic  and  role  edges. 
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2.  Non-identifying  Association  Characteristics.  For  each  type  2 base  relation 
describing  the  non-key  characteristic  c of  a predicate  p,  the  set  of  tuples  in  the 
base  relation  equals 

(tup(e:Tuple  | (3c-id:A-id)  ( a-idiA  a c-td.type  ■ Tp“^(p) 

a dom(tupie(Tp~l  (p) ) ) « dom(a-t</.chars)u  (cl 
a (Vc':C)  (c'rdomia-td.chars) 

d tup/e(Tp_1 (pi) (c'l  - o-td.chars(c’)  I 
a tupiedp-!  (p) ) (c)  - c(a-id, c) 
a dom  (o- id. roles)  - 

ir:R  | ( HTp-^  (p) , Ts  (r) ) I , o-id.roles(r).type)  i dom(tuple)) 
a (Vr:R)  (rcdom(a-id.roles) 

o tuplei  (( (Tp-^  (p) , Ts(  r) ) i , o-id.roles  (r).  type) ) - 

o- 1 d.roles (r) .chars) )) 

This  requirement  means  that  for  each  non-key  characteristic  of  an 
association  in  the  graph  state,  there  is  a tuple  containing  the  characteristics  to 
identify  the  association  along  with  the  same  value  for  the  non-key  characteristic 
as  specified  by  the  graph  state  characteristic  edge. 
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3.  Non-identifying  Roles.  For  each  type  3 base  relation  for  a predicate  p 
describing  the  unrestricted  case  s,  the  set  of  tuples  in  the  base  relation  equals 

(tuple: Tuple  | (3o— t</:A- id)  ( a-idiA  a o-id.type  « Tp“^(p) 

a tuple(Tp-^  (p) ) - o-id.chars 
a (r:R  | («-td,r)cdom(r))  - 

lr:R  | ( KTp-^  (p)  ,Ts(r))},o-i<l.roles(r).type)  c dom(tupie)) 

U (Ta_1  (sH 

a (Vr:R)  (rtdom (o-tcf.roles) 

3 tuple(  ( ( (Tp~*  Ip)  ,Ts(r) )) , a-id.roles(r).type) ) - 

a-i d.roles  ( r)  .chars) ) ) 

This  requires  a tuple  for  each  role  edge  pertaining  to  an  unrestricted  role 
in  the  graph  state. 

4.  Non-identifying  Entity  Characteristics.  For  each  type  4 base  relation  for  the 
be  e predicate  describing  the  non-key  characteristic  c of  the  entitles  of  type  e, 
the  set  of  tuples  in  the  base  relation  equals 

I tupl  e:  Tup  I e | (3e-id:E-  id)  (e-idiE  a e-id.type-e 
a dom(tuple)  » ((((be  e, object)  I , e)  I 

a dom(tuple(  ( l(be  e, object))  ,e)))  - dom(e- id. chars)  U (c) 
a (Vc':C)  (c‘cdom(tupte( ( ( (be  e, object )), e) ) ) 

o tuple! (I (be  e,  object ) I , e) ) (c  ’ ) - c(e-td,c'))l 


This  requires  a tuple  corresponding  to  each  characteristic  edge  for  an 
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entity's  non-key  characteristics. 

This  ends  Theorem  10.1. 

Proof  of  Theorem  10.1: 

Now  we  will  prove  that  Theorem  10.1  fulfills  the  equivalence 
requirements  of  Definition  7.1.  We  consider  each  clause  of  Definition  7.1. 


1.  For  both  states,  only  one  value  is  specified  for  a single  characteristic 
of  an  entity  or  association. 

This  is  true  of  valid  semantic  graph  database  states  since  by  definition 
there  may  be  only  a single  characteristic  edge  of  any  given  type  for  any  given 
entity  or  association  node.  This  requirement  is  guaranteed  to  be  satisfied  for 
valid  restricted  semantic  relation  states  by  the  constraints  defined  in  steps  II.A, 
II. B,  Il.C.l  and  II.F  of  Algorithm  9.1. 

The  definition  of  Non-null  constraints  in  II.F  guarantees  that  no  null  values 
are  assigned.  The  Equal  constraints  of  II.A  guarantee  that  each  base  relation  has 
identical  sets  of  values  for  identifying  roles  and  characteristics.  II. B guarantees 
by  Col-func  constraints  that  a unique  non-key  characteristic  is  assigned  to  each 
entity  while  Il.C.l  guarantees  the  same  for  associations. 

Since  all  Case-joins  in  the  constraints  of  step  III  Include  the  entity  type's 
key  characteristics  and  since  all  Predicate- joins  in  step  III  include  the  key 
characteristics  of  the  association  and  restricted  roles,  the  necessary  conditions 
are  also  met  in  the  database  relations  seen  by  the  user. 


2.  For  both  states,  only  one  entity  may  be  specified  as  filling  a given 
role  in  an  association. 

This  is  true  of  valid  semantic  graph  database  states  since  by  definition 
there  may  be  only  a single  role  edge  of  any  given  type  for  any  association 
node.  This  requirement  is  guaranteed  to  be  satisfied  for  valid  restricted 
semantic  relation  states  by  the  constraints  defined  in  steps  II. A,  II. C. 2 and  II.D  of 
Algorithm  9.1.  II.A  requires  the  equality  of  the  restricted  roles  in  each  base 
relation  while  II.C.2  requires  functionality  from  the  identifiers  of  an  association 
to  the  key  characteristics  fo  the  entity  filling  an  unrestricted  role. 
Furthermore,  II.D  guarantees  by  Subset  constraints  the  > existence  of  each  entity 
specified  as  filling  a role. 

Requirements  3-6  are  satisfied  by  the  existence  of  the  required  mappings 
between  the  base  relation  of  a semantic  relation  database  state  and  an  equivalent 
semantic  graph  state. 

3.  The  states  define  equal  sets  of  entity  and  association  identifiers. 

This  is  satisfied  for  equivalent  relation  and  graph  states  by  the  Existence  and 
Identity  clause  of  Theorem  10.1. 

4.  For  each  entity,  the  same  characteristics  are  defined  in  each  state  and 
are  of  equal  value. 

This  is  satisfied  in  Theorem  10.1  by  the  Existence  and  Identity  clause  for  key 
characteristics  and  the  Non-identifying  Entity  Characteristic  clause  for  non-key 


characteristics. 
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5.  For  each  association,  the  same  characteristics  are  defined  In  each  state 
and  are  of  equal  value. 

This  is  satisfied  in  Theorem  10.1  by  the  Existence  and  Identity  clause  for  keg 
characteristics  and  the  Non-identifying  Association  Characteristic  clause  for 
non-key  characteristics. 

6.  For  each  association,  the  same  roles  are  defined  in  each  state  and  are 
filled  by  the  same  entity. 

This  is  satisfied  in  Theorem  10.1  by  the  Existence  and  Identity  clause  for 
restricted  roles  and  the  Non-identifying  Association  Role  clause  for 
unrestricted  roles. 

End  of  Proof. 

10.2  Proof  of  Data  Model  Equivalence 

The  following  theorem  asserts  the  state-dependent  equivalence  of  the 
semantic  graph  and  the  restricted  semantic  relation  data  models.  It's  proof  is 
quite  lengthy.  However,  an  outline  of  the  proof  will  be  presented  to  show  the 
essential  ideas  of  the  proof.  The  discussion  following  the  proof  should  be 
informative  even  if  the  proof  itself  is  not  read  in  detail. 
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Theorem  10.2:  The  restricted  semantic  relation  data  model  and  the  semantic 
graph  data  model  are  state-dependent  equivalent. 

Proof  of  Theorem  10.2: 

As  mentioned  In  the  introduction  to  this  chapter,  this  proof  is  based  on  an 
inductive  proof  presenting  the  constructions  of  equivalent  operations  to  show 
the  state-dependent  equivalence  of  any  pair  of  semantic  graph  and  semantic 
relation  application  models  where  the  relational  schema  was  derived  form  the 
graph  schema  by  Algorithms  8.1  and  9.1.  We  will  show  equivalence  for  any 
such  pair  of  application  models. 

The  application  model  equivalence  relation  so  defined  is  onto  the  semantic 
graph  data  model  because  Algorithms  8.1  and  9.1  allow  the  definition  of 
restricted  semantic  relation  schemas  for  any  graph  schema.  This  application 
model  equivalence  relation  is  also  onto  the  restricted  semantic  relation  data 
model  since  every  restricted  semantic  relation  schema  is  derived  form  some 
graph  schema  by  Algorithms  8.1  and  9.1.  This  will  prove  Theorem  10.2 
according  to  the  definition  of  state-dependent  data  model  equivalence. 
Definition  7.7. 
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Consider  semantic  graph  application  model 

g-app  - (g-schema,  g-state0,  g-op- types) 

where: 

1.  g-stateQ  is  the  empty  state. 

2.  g-op- types  defines  the  semantic  graph  operation  types, 
and  semantic  relation  application  model 

r-app  » (r-schema,  r-stateQ,  r-op-types) 

where: 

1.  r-schema  is  derivable  from  g-schema  by  Algorithms  8.1  and  9.1 

2.  r-stateQ  is  the  empty  state. 

3.  r-op-types  defines  the  restricted  semantic  relation  operation  types. 

Then  g-app  and  r-app  are  shown  to  be  state-dependent  equivalent  by  the 
construction  of  state-dependent  equivalent  operations.  An  outline  of  the  proof 
follows. 

I. .  Show  that  initial  states  g-stateQ  and  r-stateQ  are  equivalent. 

II.  Show  for  any  given  pair  of  valid  equivalent  states  g-state  and  r-state,  and 
a given  valid  semantic  graph  operation  g-op,  where  g-op(g-state)  - g-state*, 
that  a valid  relational  operation  r-op  can  be  constructed  so  that  r-state’ , where 
r-op(r-state)  - r-state’,  is  equivalent  to  g-state’.  Note  that  we  will  not 
need  to  use  composed  relational  operations  to  obtain  the  necessary  equivalence 
properties.  Consider  the  following  cases: 
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g-state’  * error 

1.  The  type  of  g-op  is  Insert-ind-entity,  Insert-ind-association,  or 

Insert-semantic-unit. 

2.  The  type  of  g-op  is  Delete-ind-entity,  Delete-ind-association,  or 

Delete-semantic-unit. 

g-state'  - error 

III.  Show  for  any  given  pair  of  valid  equivalent  states  g-state  and  r-state,  and 
a given  valid  semantic  relation  operation  r-op,  where  r-op(r-state>  - r-state*, 
that  a composition  of  valid  graph  operations  g-op*  can  be  constructed  so  that 
g-state',  where  g-op* (g-state)  » g-state’,  is  equivalent  to  r-state’.  Consider 
the  following  cases: 

A.  r-state’  * error 

1.  r-op  is  of  the  type  Insert- tuples. 

2.  r-op  is  of  the  type  Delete-tuples. 

B.  r-state’  - error 

The  complete  constructions  follow  along  with  justifications  for  their  correctness. 
Examples  based  of  the  operations  of  previous  chapters  will  be  Interspersed. 
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I.  The  initial  states  g-state0  and  r-state0  are  empty  states.  These  are  clearly 
equivalent  according  to  Theorem  10.1. 

II.  Show  for  any  given  pair  of  valid  equivalent  states  g-state  and  r-state,  and 
a given  valid  semantic  graph  operation  g-op,  where  g-op(g-state)  - g-state*, 
that  a valid  relational  operation  r-op  can  be  constructed  so  that  r-state’,  where 
r-op (r-state)  - r-state’,  is  equivalent  to  g-state’.  Consider  the  following 
cases: 

A.  If  g-state’  • error,  then  consider  the  valid  relational  state  r-state’ 
which  is  equivalent  to  g-state’.  This  is  uniquely  defined  for  the  given 
r-schema. 

The  uniqueness  of  r-state'  is  guaranteed  for  the  following  reasons, 
g-state’  defines  a unique  set  of  base  relations  according  to  Theorem  10.1. 
The  set  of  base  relations  defines  a unique  database  state  r-state*  according 
to  the  constraints  defined  in  step  III  of  Algorithm  9.1  which  express  each 
database  relation  in  terms  of  the  base  relations.  The  uniqueness  of  the 
database  relations  follows  from  the  fact  that  in  each  instance  of  a Case-join 
or  a Predicate-join,  the  columns  which  are  matched  for  the  join  include  the 
identifying  characteristics  of  the  entity  or  association  or  concern. 
Constraints  from  step  II  of  Algorithm  9.1  guarantee  that  the  base  relations 
have  equal  sets  of  identifying  characteristics. 
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1.  If  the  type  of  g-op  is  Insert-ind-entity,  Insert-ind-association  or 
Insert-semantic-unit,  then  let  r-op  be 

operation  (r-state:R-state) 
returns  R-state 

return  Insert-tuples (r-schema.  r-state,  addi  t ion) 
where  addition  - Ar-n.(r-state’  (r-n)  - r-state  (r-n) ) . That  is,  the 

operation  simply  inserts  those  tuples  in  r-state'  which  do  not  exist  In 
r-state.  This  operation  applied  to  r-state  will  result  in  r-state'. 

r-state’  is  by  construction  equivalent  to  g-state'. 

Note  however,  that  there  may  be  tuples  in  r-state  which  do  not 
exist  in  r-state’.  Those  would  be  tuples  with  null  values  where  added 
tuples  take  precedence.  These  tuples  would  automatically  disappear,  but 
the  information  in  them  would  still  be  present  in  the  tuples  which 

replaced  them. 

For  example,  the  graph  state  of  Figure  5.2  is  equivalent  to  the 
relational  state  of  Figure  6.6.  The  graph  state  of  Figure  5.6  is 

equivalent  to  the  relational  state  of  Figure  6.11.  The  relational 

operation  6-4  is  then  equivalent  to  the  graph  operation  5-2  since  6-4 
inserts  the  single  tuple  which  is  in  Figure  6.11  but  not  in  Figure  6.6. 
This  is  a case  in  which  the  inserted  tuple  takes  precedence  over  an 


existing  tuple  in  the  relation. 


2.  If  the  type  of  g-op  is  Delete-ind-entity,  Delete-ind-association  or 
Delete-semantic-unit,  then  let  r-op  be 

operation  (r-stote:R-state) 
returns  R-state 

return  Delete-tuples (r- schema, r-state, delet ion) 
where  deletion  is  defined  by  considering  the  projections  of  r-state* 
into  the  base  relations  of  r-state’.  Also  consider  the  projections  of 
r-state  into  the  base  relations  of  r-state. 

a.  Define  r-state’ j in  the  following  manner.  Consider  the  set  of 
base  relations  which,  for  each  base  relation,  consists  of  the  set 
difference  between  the  base  relation  of  r-state  and  the  corresponding 
base  relation  of  r-state’  (base  relation  of  r-state  - base  relation  of 
r-state’).  The  base  relations  will  have  the  tuples  describing  each 
deleted  entity  or  association.  Add  to  these  base  relations  those  tuples 
in  the  types  1 and  4 base  relations  of  r-state’  describing  each 
entity  which  was  itself  not  deleted,  but  was  in  a deleted  association. 
These  tuples  are  identified  by  matching  the  characteristic  in  the  type 
1 relation  for  the  predicate  corresponding  to  the  association, 
r-state* j is  then  uniquely  defined  by  these  base  relations  and  the 
constraints  which  express  each  database  relation  in  terms  of  the  base 
relations.  Note  that  r-state’ j may  not  itself  be  a valid  state. 


- 229  - 


b.  Let  r-state’2  be  r-state’j  except  for  the  deletion  of  all  of  those 
tuples  which  are  less  than  or  equal  to  any  tuple  in  the 
corresponding  relation  in  r-state*.  That  is,  we  want  to  specify  a 
set  of  tuples  consisting  of  all  of  those  statements  which  are  true  of 
the  objects  deleted  from  the  graph  state  and  which  are  not  true  of 
entities  or  associations  remaining  in  the  graph  state. 

c.  Finally,  for  each  tuple  in  r-9tate'2  which  has  the  be  e:object 
predicate:case  pair  present  along  with  any  other  predicate-.case  pair  in 
the  same  column  group,  if  the  characteristics  in  the  tuple  are  equal 
to  the  characteristics  in  any  of  the  tuples  in  the  types  1 or  4 base 
relations  which  described  entities  which  were  not  themselves  deleted, 
but  were  in  deleted  associations,  then  eliminate  be  e:object  from  the 
column  group's  set  of  predicate:case  pairs.  Let  the  resulting  state  be 
deletion.  This  is  done  because  while  we  may  need  information 
about  these  entities  to  describe  the  deleted  association,  we  do  not 
want  to  delete  the  information  describing  the  non-deleted  entities. 

As  an  example,  consider  operation  5-5.  We  want  to  derive  an 
equivalent  relational  operation.  We  have  claimed  that  operation  6-14  is 
such  an  operation.  These  operations  map  equivalent  states  6. 1 1 and  5.6  into 
states  6.6  and  5.2.  Figures  10.1  and  10.2  show  the  Super vise-operate 
relation  before  (r-state)  and  after  (r-state*)  the  deletion. 

The  difference  between  the  base  relations  of  state  6.11  and  state  6.6  1s 


I 
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Figure  10.1  - Relation  before  Deletion 
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Figure  10.2  - Relation  after  Deletion 


L 


the  single  tuple  shown  in  Figure  10.3  with  empty  sets  for  all  of  the  other 
base  relations.  We  then  add  the  tuples  describing  the  entities  in  the  deleted 


association  to  get  Figure  10.4  (with  empty  sets  for  those  base  relations  not 
pictured).  These  base  relations  then  correspond  to  the  state  shown  in 
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Figure  10.5. 


Supervise-base 


supervise:agent 

supervise  rob ject 

employee 

employee 

name 

name 

names 

names 

G.Wayshum 

B.Belglm 

Figure  10.3  - Difference  In  Base  Relations 


Then  we  delete  from  Figure  10.5  (r-state’^)  the  second  tuple  In 
Supervise-operate  since  it  also  occurs  in  Figure  10.2  (r-state’)  to  get 
Figure  10.6  (r-state^).  Since  in  the  one  remaining  tuple,  the  values  In 
the  column  containing  the  be  employee: object  pair  are  equal  to  those  in  one 
of  the  tuples  added  to  the  base  relations  of  Figure  10.3  (resulting  In 
Figure  10.4),  be  employee: object  is  removed  from  the  heading  of  the  tuple 
to  result  in  Figure  10.7  (deletion).  This  is  the  single  tuple  deleted  In 
operation  6-14. 
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Figure  10.4  - Addition  to  Base  Relations 


Be-employee-base 


Employee-char-base 


be  employee :object 

employee 

name 

age 

names 

years 

G. Wayshum 

50 

B .Belgim 

35 

1 


B.  If  g-op(g-state)  = error,  then  let  r-op  be 

operation  (r-stote:R-state) 
returns  R-state 

return  Insert-tuples (r-schema,  r-state,  l(r-n,  Itlll) 
vhere  r-n  is  the  name  of  any  relation  in  r-schema  and  t is  a tuple  mapping 
each  characteristic  defined  in  the  relation  into  the  null  value. 

This  operation  will  map  any  r-state  into  error.  Hence  r-op  and  g-op 
are  equivalent. 
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Figure  10.5  - Relation  in  r-state’j 
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Figure  10.6  - Relation  in  r-state*2 
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Figure  10.7  - Relation  in  deletion 

III.  Show  for  any  given  pair  of  valid  equivalent  states  g-state  and  r-state,  and 
a given  valid  semantic  relation  operation  r-op,  where  r-op (r-state)  « r-state', 
that  a composition  of  valid  graph  operations  g-op*  can  be  constructed  so  that 
g-state’,  where  g-op*(g-state)  - g-state’,  is  equivalent  to  r-state’.  Consider 
the  following  cases: 

A.  If  r-state’  * error,  then  consider  the  set  of  base  relations  corresponding 
to  r-state  and  the  set  of  base  relations  corresponding  to  r-state*. 

1.  For  the  case  that  r-op  is  of  the  type  Insert-tuples: 
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a.  If,  among  the  set  of  type  1 base  relations  defining  the  restricted 
roles  and  key  characteristics  of  the  predicates,  only  a single  tuple  t 
with  the  predicate  p has  been  added  to  the  base  relations  of  r-state 
to  get  the  base  relations  of  r-state’,  then: 


i.  If  p is  of  the  form  be  e for  some  entity  type  e,  then  a single 
independent  entity  has  been  inserted.  Let  g-op*  be 

operation  (p-stote: g-state) 
returns  g-state 

return  Insert-ind-entity (g-schema, g-state, e- desc) 

where  e-desc.E-id  - (e,t((((be  e, object)  I ,e) ) ) 
and  e-desc.chars  = Utj((l(be  e,  object)  I , e) ) 

where  the  are  the  i tuples  in  the  types  1 and  4 base  relations 
describing  the  characteristics  of  the  added  entity.  By  Theorem 
10.1,  we  know  that  such  tuples  must  also  have  been  added. 
This  results  in  an  equivalent  graph  operation  Inserting  the  entity. 

ii.  If  p is  not  of  the  form  be  e,  then  a single  Independent 
association  has  been  inserted.  Let  g-op*  be 

% 

operation  (g-state:g-state) 
returns  g-state 

return  inseri-ind-association  (g-schema,  g-state,  a-desc) 

where  a-desc.A-id  - (Tp~^  (p) , roles,  t(Tp-^  (p) ) ) 
where  roles  c Role-funct  ion 
and  dom(roies)  » 

ir:R  | Ge:E)  ( ( ( (Tp-*  (p) , Ts  (r) ) I , e)  c dom(t)M 
and  (Vr:R)(r  < domiroies)  d 
(roies(r). chars  - 

t ( ( ( (Tp'1 (p)  ,Ts(r) ) ) , e) ) ) 
and  dom(a-desc.roles)  - domt rotes)  U 

(r:R  | type  3 base  relations  exist  describing  the 
unrestricted  case  Te(r)  of  the  predicate  p) 
and  (Vr:R)  (r  < dom (a-desc. roles) 

o a-desc. roles ( r)  - tt  ( ( HTp-^  (p) , Te (r) ) I , e) ) ) 


where  the  tf  are  the  type  1 tuple  and  the 
tuples  added  to  the  base  relation  for  each 
unrestricted  case.  Such  tuples  must  exist  due  to 
Definition  6.1. 

and  dom(a-desc.chars)  ■ dom(t(Tp“*  (p) ) ) 

U lc:C  | type  2 base  relations  exist  describing 
non-keg  characteristics  of  pi 
and  (Vc:C)(c  < dom(a-desc.chars) 

d a-desc.chars  (c)  « t j (Tp-*  (p) ) (c) ) 
where  tt  is  the  tuple  added  to  the  base  relation  for 
each  non- keg  characteristic.  Such  tuples  must  exist  due 
to  Theorem  10.1.  This  results  in  an  equivalent  graph 
operation  inserting  the  association. 


b.  If  several  tuples  have  an  inserted  into  type  1 base  relations, 
then  several  entities  and/or  associations  have  been  inserted. 

i.  These  tuples  can  be  partitioned  according  to  the  following 
rules: 

1) .  Every  tuple  added  to  be  e type  1 base  relations  for 
which  the  entity  type  e is  a mandatorg  case  of  some  predicate 
p (other  than  be  e)  must  be  in  the  same  partition  as  the 
tuple  in  the  type  1 base  relation  for  p which  has  the 
Identical  entity  type  and  characteristics  for  the  mandatorg 
case. 

2) .  Any  tuple  which  is  not  required  to  be  in  some  partition 
be  rule  1).  above  must  be  in  a partition  which  contains  the 


tuple  as  its  only  member. 
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This  partitioning  is  necessary  because  a single  relational 
operation  may  add  information  to  the  database  state  which 
equivalently  requires  several  semantic  graph  operations.  The 
partitioning  divides  the  added  type  1 tuples  into  sets  which 
specify  either  added  semantic  units,  added  independent  entities  or 
added  Independent  associations. 

ii.  The  equivalent  graph  operation  is  then  defined  as  the 
composition,  in  order,  of  network  operations  adding  each 
independent  entity,  independent  association  or  semantic  unit. 

Let 

9-op*  - g-op1°g-op2t‘  - . . 90~opn 
where  each  g-op ^ satisfies  one  of  the  following  cases: 

1).  If  the  partition  consists  of  a single  tuple  from  a type  1 
base  relation  for  a be  e predicate,  then  an  independent  entity 
has  been  added.  Let  g-opi  be  defined  for  this  tuple  as  in 
l.a.i  with  the  provision  that  tuples  which  define  non-keu 
characteristics  must  have  key  characteristics  identical  to  the 


tuple  in  the  partition. 
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2) .  If  the  partition  consists  of  a single  tuple  from  a type  1 
base  relation  for  a predicate  other  than  be  e,  then  an 
independent  association  has  been  added.  Let  g-opi  be  defined 
for  this  tuple  as  in  l.a.ii  with  the  provision  that  restricted 
case  key  characteristics  and  predicate  key  characteristics  in  the 
tuples  defining  unrestricted  case  and  non-key  characteristics 
must  match  those  of  the  tuple  in  the  partition. 

3) .  If  the  partition  consists  of  more  than  one  tuple,  then  a 

semantic  unit  has  been  added.  Then  let  g-opi  be 

operation (g-state: g-state) 
returns  g-state 

return  Insert-semantic-unit (g-schema,0-state, arp- list) 
where  arg-list  is  defined  as  follows  after  first  arbitrarily 
picking  one  of  the  be  e tuples  in  the  partition  and  beginning 
the  procedure  with  an  empty  arg-list. 

a) .  Generate  an  e-desc  for  the  entity  as  specified  in  step 
l.a.i  and  append  this  to  the  end  of  the  arg-list. 

b) .  For  each  association  in  which  this  entity  type  is 
mandatory  (other  than  be  e),  define  an  a-deec  for  the 
predicate  of  this  association  type  according  to  l.a.ii.  If 
this  a-desc  is  not  already  in  arg-list,  then  add  it  to 


arg-list  with  the  requirement  that  firstly  e-deec’s  are 
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generated  according  to  a.  above  for  each  entity  tuple 
which  is  being  added  in  the  same  partition  and  these 
e-de9c's  are  added  to  arg-list  if  they  are  not  already  in 
the  list.  The  e-desc's  are  added  according  to  the  ordering 
on  role  names  specified  by  next-role.  This  step's  generation 
of  a-desc's  is  to  be  ordered  by  the  ordering  on  association 
names  specified  by  next-assoc-role. 

As  an  example,  consider  the  relational  operation  6-1.  We  want  to 
derive  an  equivalent  graph  operation.  We  have  claimed  that  5-3  is  such 
an  operation.  The  operations  map  the  equivalent  states  of  Figures  5.4 
and  6.7  into  the  equivalent  states  of  Figures  5.7  and  6.8. 

The  base  relations  corresponding  to  state  6.7  are  shown  in 
Figure  10.8.  The  base  relations  corresponding  to  state  6.8  are  shown  in 
Figure  10.9.  The  partitioning  of  the  tuples  added  to  the  type  1 base 
relations  results  in  a single  partition,  a tuple  from  Be-machine-base  and 
a tuple  from  Operate-base,  since  operatesob ject  is  a mandatory  case  for 
the  machine  entity  type.  So  a single  semantic  unit  has  been  added  and 
the  procedure  as  specified  above  is  to  be  followed. 

An  arg-list  must  be  defined  which  first  describes  the  added  entity 
and  then  describes  the  added  association.  The  straightforward  result  of 
this  procedure  for  the  base  relations  of  Figures  10.8  and  10.9  is  the 
arg-list  used  in  operation  5-3. 
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Figure  10.9  - Base  Relations  after  Insertion 
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2.  For  the  case  that  r-op  is  of  the  type  Delete-tuples: 

a.  If,  among  the  set  of  type  1 base  relations  defining  the  restricted 
roles  and  Key  characteristics  of  the  predicates,  only  a single  tuple  t 
with  the  predicate  p has  been  deleted  from  the  base  relations  of 
r-state  to  get  the  base  relations  of  r-9tate’,  then 

1.  If  p is  of  the  form  be  e for  some  entity  type  e,  then  a single 
independent  entity  has  been  deleted.  Let  g-op*  be 

operation  (g-state:  g-state) 
returns  g-state 

return  Delete  ind -entity  (g-schema,  g-state, 

(e.t((l(be  e,  ob  ject)  I , e) ) ) ) 

This  specifies  the  type  and  the  key  characteristics  of  the  entity  to 
be  deleted. 

ii.  If  p is  not  of  the  form  be  e,  then  a single  independent 
association  has  been  deleted.  Let  g-op* 

operation  (g-state: g-state) 
returns  g-state 

return  Delete-ind-association  (g-schema,  g-state,  a- i d) 

where  a-id  - (Tp_*  (p) , roles,  t(Tp_*  (p) ) ) 
where  roles  t Role-function 
and  dom(rofes)  - 

(r:R  | (3e:E)  ( ( ( (Tp_1  (p) , Ts(r).)  I , e)  c dom(t)>» 
and  (VrsRHr  t dom(roles)  d 
(roles  (r) .chars  - 

t(  ( KTp-1  (p)  ,Ts(r)  )l  ,e) ) ) 

This  identifies  the  association  to  be  deleted. 
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b.  If  several  tuples  have  been  deleted  from  type  1 base  relations, 
then  several  entities  and/or  associations  have  been  deleted. 

1.  These  tuples  can  be  partitioned  as  specified  in  l.b  for  the 
Insert-tuples  case.  Similar  to  that  case,  it  means  here  that  some 
combination  of  semantic  units,  independent  entities  and 
independent  associations  have  been  deleted. 

ii.  The  equivalent  graph  operation  is  then  defined  as  the 
composition,  in  order,  of  network  operations  deleting  each 
semantic  unit,  independent  association  or  independent  entity. 

Let 

g-op*  - g-op1°g-op2°  . . . °g-opn 
where  each  g-opi  satisfies  one  of  the  following  cases: 

1) .  If  the  partition  consists  of  a single  tuple  from  a type  1 
base  relation  for  a be  e predicate,  then  an  independent  entity 
has  been  deleted.  Let  9-0 be  defined  for  this  tuple  as  in 
2.a.l. 

2) .  If  the  partition  consists  of  a single  tuple  from  a type  1 
base  relation  for  a predicate  other  than  be  e,  then  an 
independent  association  has  been  deleted.  Let  p-opj  be 
defined  for  this  tuple  as  in  2.a.ii. 
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3).  If  the  partition  consists  of  more  than  one  tuple,  then  a 
semantic  unit  has  been  deleted.  Then  let  p-opj  be 

operation  (p-s tote: g-state) 
returns  g-state 

return  Delete-semantic-unit (g-scheaa, 9-state, a- i d) 
where  a- id  is  defined  as  in  2.a.ii  for  any  arbitrarily  chosen 
tuple  in  the  partition  representing  an  association  in  the 
partition  (i.e.  a predicate  other  than  be  e).  This  is  sufficient 
to  define  a semantic  graph  operation  which  will  delete  the 
entire  semantic  unit. 

B.  If  r-state’  « error,  then  let  g-op*  be 

operation  (g-state: g-state) 
returns  g-state 

. 

return  Insert-ind-entity  (g-schema,  g-state,  e-desr) 
where  e-desc  t E-desc  has  a chars  component  which  maps  some 

j ] 

characteristic  into  a domain  element  (d,9tr)  such  that  g-schena.sd(d)  (str)  - 
false.  That  is,  the  value  is  not  a valid  element  of  the  domain. 

This  operation  will  map  any  g-state  into  error.  Hence  g-op*  and  r-op 
are  equivalent. 
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This  completes  the  inductive  proof  of  application  model  equivalence.  Since 
it  is  applicable  to  any  pair  of  application  models  consisting  of  a semantic  graph 
application  model  and  a restricted  semantic  relation  application  model  whose 
schema  was  derived  from  the  graph  model,  data  model  equivalence  has  been 
shown  between  the  semantic  graph  data  model  and  the  restricted  semantic 
relation  data  model. 

This  ends  the  proof  of  Theorem  10.2. 

10.3  Discussion 

At  this  point  we  will  summarize  the  basic  ideas  behind  the  constructions 
presented  in  the  proof  of  Theorem  10.2.  Part  1 simply  states  the  equivalence  of 
the  empty  states.  In  part  II  a construction  is  presented  to  find  a relational 
operation  equivalent  to  a given  graph  operation.  We  know  which  relational 
state  is  equivalent  to  the  graph  state  resulting  from  the  graph  operation,  so  we 
work  backwards  from  this  to  find  the  necessary  relational  operation. 

In  the  case  in  which  entities  or  associations  were  added  to  the  graph  state, 
an  Insert-tuples  operation  is  constructed  which  simply  adds  those  tuples  present  in 
the  desired  relational  state  but  not  present  in  the  initial  relational  state.  In  the 
case  that  entities  or  associations  were  deleted  from  the  graph  state,  a Delete-tuples 
operation  is  constructed  which  deletes  all  of  those  statements  true  in  the  initial 
relational  state  but  not  true  in  the  desired  state.  Both  the  Insert-tuples  and 
Delete-tuples  operations  make  use  of  the  partial  ordering  defined  on  tuples. 
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In  part  III  a construction  is  presented  to  find  a graph  operation  equivalent 
to  a given  relational  operation.  The  relational  operation  is  found  by  examining 
the  base  relations  of  the  two  relational  states.  From  the  base  relations, 
descriptions  are  derived  of  the  entitles  and/or  associations  which  have  been 
inserted  or  deleted.  A composed  operation  is  necessary  if  more  than  one 


independent  entity,  independent  association  or  semantic  unit  is  to  be  Inserted  or 
deleted. 

Note  that  the  construction  in  part  III  could  have  been  done  as  in  part  II  by 
first  finding  the  graph  state  which  is  the  desired  result  of  the  graph  operations 
and  then  working  backwards  from  there  to  find  the  necessary  graph  operation. 
The  construction  presented  was  chosen  because  it  better  illustrated  how  the 
information  about  the  state  is  distributed  throughout  the  base  relations. 

We  will  now  make  several  observations  concerning  these  constructions. 
Firstly,  recall  that  state-dependent  equivalence  allows  the  use  of  composed 
operations.  For  the  two  data  models  studied  here,  we  needed  to  construct 
composed  operations  only  for  the  semantic  graph  model  since  a single  relational 
operation  can  Insert  an  arbitrary  number  of  entities  and  associations. 

Secondly,  we  can  see  that  the  state-dependence  required  for  the 
constructions  arises  from  the  nature  of  the  semantic  relation  data  model.  The 
proper  form  of  an  insertion  or  deletion  is  dependent  upon  the  database  state 
being  modified. 

Finally,  it  should  be  noted  that  the  handling  of  operations  which  result  in 
the  error  state  is  not  completely  satisfactory.  In  particular,  all  errors  are 


T 
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treated  identically.  An  actual  implementation  would  certainly  distinguish 
different  hinds  of  errors. 
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Chapter  11  - Conclusions 

11.1  Summary 

The  motivation  for  this  research  was  to  gain  an  understanding  of  data 
model  equivalence  properties.  Such  an  understanding  is  necessary  if  one  wishes 
to  implement  any  of: 

1.  An  ANSI-like  database  interface  providing,  for  a single  database, 
multiple  views  based  on  different  data  models. 

2.  A common  interface  to  several  database  systems  which  are  defined  in 
terms  of  different  data  models. 

3.  The  transference  of  both  databases  and  programs  between  systems  based 
on  different  data  models. 

We  were  also  also  interested  in  gaining  insight  into: 

4.  The  evaluation  of  data  model  expressiveness. 

5.  The  relational  vs.  network  data  model  "controversy''. 


To  convincingly  attack  these  issues,  the  notion  of  a semantic  data  model 
was  introduced.  Specifically,  the  semantic  graph  and  semantic  relation  data 
models  were  defined  in  Chapters  2,  5 and  6.  These  data  models  are  meant  to 
be  semantic  versions  of  the  network  and  relational  data  models.  The  structures 
used  in  semantic  data  models  are  meant  to  have  clear  interpretations  in  terms  of 
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the  applications  being  modelled.  The  semantics  of  the  two  data  models  defined 
in  this  thesis  are  expressed  by  natural  language  case  grammars. 

Each  object  in  a semantic  graph  database  state  is  meant  to  represent  an 
object  in  the  state  of  the  application  being  modelled.  The  operation  types 
allowed  are  the  insertion  and  deletion  of  entities,  associations  or  semantic  units. 

A semantic  relation  database  state  consists  of  statements  (tuples)  describing 
the  state  of  the  application.  The  allowed  operation  types  are  the  insertion  and 
deletion  of  sets  of  such  statements.  The  insertion  and  deletion  operations  are 
defined  in  terms  of  a partial  ordering  of  statements. 

Thpse  two  data  models  were  defined  in  terms  of  a formal  framework  for 
data  model  definition  as  presented  in  Chapter  4.  This  framework  is  based  on 
the  notions  of  application  model,  schema,  operation  type,  database  state  and 
database.  The  framework  is  applicable  to  syntactic  data  models  as  well  as 
semantic  data  models.  However,  the  task  of  comparing  data  models  is  claimed  to 
be  easier  for  semantic  data  models. 

Chapter  7 specifies  the  various  equivalence  properties  of  interest.  Data 
model  equivalence  is  based  on  the  notions  of  application  model  equivalence, 
operation  equivalence  and  database  state  equivalence.  It  was  shown  that 
composed  operation  and  state  dependent  equivalence  must  be  considered  in 
addition  to  isomorphic  equivalence.  The  choice  between  isomorphic,  composed 
operation  and  state  dependent  equivalence  for  a specific  multiple  data  model 
system  impacts  the  types  of  implementations  which  can  be  used. 

Due  to  the  generality  of  the  semantic  relation  data  model  constraints,  the 
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semantic  graph  and  semantic  relation  data  models  are  not  totally  (state 
dependent)  equivalent.  A data  model,  the  restricted  semantic  relation  data 
model,  which  is  totally  equivalent  to  the  semantic  graph  data  model  was  then 
defined. 

Chapter  8 showed  how  to  define  all  but  the  constraint  component  of 
restricted  semantic  relation  application  models.  The  essence  of  this  procedure 
was  the  definition  of  a semantic  relation  as  containing  statements  describing  a 
particular  portion  of  a semantic  graph  application.  It  is  possible  to  define  several 
relational  application  models  for  a single  graph  application  model.  Such  a 
feature  seems  appropriate  for  database  systems  providing  differing  views  of  the 
same  application.  It  is  also  possible  that  a semantic  relation  application  model  Is 
redundant,  containing  the  same  information  in  several  relations.  The  user  may 


desire  to  combine  statements  concerning  the  application  in  many  different  ways. 

The  restricted  semantic  relation  constraint  definition  algorithm  was 
presented  in  Chapter  9.  Five  types  of  constraints  are  used:  Equal,  Equate,  Subset, 
Non-null  and  Col-func.  Constraints  are  stated  in  terms  of  a semantic  relational 
algebra.  The  operations  in  this  algebra  are  Case-join,  Predicate-join,  Conjunction  and 
Project.  It  was  observed  that  a large  number  of  relational  constraints  are 
required  for  even  a small  application. 

The  state  dependent  equivalence  of  the  semantic  graph  and  semantic 
relation  data  models  was  proved  in  Chapter  10.  Equivalence  was  shown  for 
each  pair  of  application  models  consisting  of  a semantic  graph  application  model 
and  a semantic  relation  application  model  whose  schema  was  derived  from  the 
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graph  schema  using  the  algorithms  presented  in  Chapters  8 and  9.  Equivalence 
for  each  such  pair  of  application  models  implied  equivalence  of  the  two  date 
models.  Application  model  equivalence  was  shown  by  presenting  algorithms  to 
construct  equivalent  operations  for  the  application  models. 

11.2  Relational  vs.  Network:  Some  Observations 

The  various  arguments  in  favor  of  either  the  relational  or  network 
syntactic  data  models  have  counterparts  in  terms  of  the  semantic  data  models. 
We  feel  that  our  examination  of  the  equivalence  problems  of  the  semantic  data 
models  gives  some  insight  for  considering  the  validity  of  these  arguments.  We 
will  comment  on  this  "controversy". 

Firstly,  we  can  state  that  when  restricting  the  data  models  to  equivalent 
graph  and  relational  application  models,  the  "expressive  powers"  of  the  models 
are  equivalent.  Each  state  and  each  operation  in  one  model  has  an  equivalent 
counterpart  in  the  other  data  model.  So  comments  on  whether  the  graph  or 
relational  view  is  "better"  are  strictly  subjective  unless  the  comparison  is 
dependent  upon  the  existence  of  relational  application  models  with  no  equivalent 
graph  model.  We  will  discuss  this  facet  of  the  comparison  later. 

We  can  look  at  two  major  facets  of  the  capabilities  provided  by  a database 
system  - retrieval  and  update.  Recall  from  Chapter  3 that  one  of  the  main 


distinctions  drawn  in  the  literature  between  network  and  syntactic  relational 
database  systems  is  in  terms  of  set-oriented  or  "one-at-a-tlme"  retrieval 
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languages.  As  pointed  out  in  Chapter  3,  this  is  really  an  argument  concerning 
efficiency  of  implementation  and  either  type  of  language  could  be  used  with 
either  type  of  data  model.  The  same  points  hold  for  the  semantic  data  models. 

Proponents  of  the  syntactic  relation  model  claim  that  the  relational  model 
is  much  easier  for  the  user  to  comprehend  since  it  presents  only  a single  type 
of  structure,  relations,  while  the  network  models  present  several  different  types 
of  structures  [Date  and  Codd],  We  can  see  from  examining  the  formal  state 
specifications  of  the  semantic  graph  and  semantic  relation  data  models  that  the 
same  "simplicity"  of  the  relational  view  is  reflected  in  Its  less  complex 
specification. 

Advocates  of  the  syntactic  relational  model  also  claim  that  update 
operations  are  also  quite  simple  and  clear  because  there  are  only  two  simple 
types  of  operations  - adding  or  deleting  tuples  [Date  and  Codd].  We  can  see  in 
our  semantic  models  that  the  formal  definitions  of  the  relational  operation  types 
are  certainly  less  complex  than  the  definitions  of  the  graph  operation  types. 
But  this  argument  completely  ignores  the  complexity  of  the  large  number  of 
constraints  necessary  to  maintain  the  semantic  integrity  in  a relational  database. 
An  attempt  at  performing  updates  to  a semantic  relation  database  without  a 
knowledge  of  the  constraints  could  be  a difficult  process.  In  the  graph  model, 
knowing  what  type  of  operation  to  use  recognizes  that  certain  types  of 
predefined  requirements  must  be  satisfied  by  the  update  operation.  One  way  or 
another  the  complexity  of  the  application  must  be  represented  in  the  application 


model.  The  choice  of  which  data  model  to  use  can  significantly  alter  where 
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the  complexity  is  to  reside. 

11.3  Foundation  for  a User  Interface 

As  just  stated,  one  way  or  another  the  complexity  of  the  application  must 
be  represented  in  the  application  model;  the  choice  of  which  data  model  to  use 
can  significantly  alter  where  the  complexity  is  to  reside.  While  the  semantic 
relation  data  model  provides  a nice  tabular  view  of  the  database  state,  it  seems 
poor  for  performing  updates.  The  semantic  graph  data  model  allows  an  easier 
understanding  of  updates,  but  would  be  quite  unwieldy  for  retrievals  other  than 
those  involving  a small  amount  of  data.  However,  the  ability  to  support 
equivalent  relational  and  graph  application  models  accessing  a shared  database 
would  allow  the  best  of  both  worlds  - a simple  relational  view  for  retrieval 
and  a graph  model  for  updating. 

The  data  model  equivalence  properties  proved  in  this  thesis  for  the 
semantic  graph  and  restricted  semantic  relation  data  models  guarantee  that  this 
ability  could  be  provided  in  a single  system.  The  system  could  also 
accommodate  those  users  desiring  retrievals  in  terms  of  the  semantic  graph 
model  and  updates  in  terms  of  relations.  We  propose  here  the  construction  of  a 
database  system  which  provides  as  a user  interface  a dual  semantic  data  model 
consisting  of  the  semantic  graph  and  restricted  semantic  relation  data  models. 

For  the  user  of  this  system,  database  design  would  begin  with  the 


t 


construction  of  a semantic  graph  schema.  Based  on  this  schema,  a procedure 
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similar  in  principle  to  Algorithm  8.1  would  be  followed  to  "encircle"  portions  of 
the  graph  schema.  Each  such  portion  would  define  a relation. 

We  saw  earlier  in  this  thesis  that  a natural  language  sentence  template  can 
be  associated  with  each  relation.  We  envision  that  these  templates  would  be 
automatically  generated  for  each  relation.  This  automatic  generation  is  much 
like  the  generation  of  sentences  from  "deep  structure"  representations  In 
artificial  intelligence  natural  language  systems  [Bruce].  The  generation  of  the 
template  would  serve  as  feedback  during  the  design  process.  The  designer 
would  examine  the  templates  to  confirm  that  the  relations  contain  the  desired 
information  in  the  proper  form.  After  the  design  stage,  the  templates  would  be 
displayed  with  the  relations.  The  templates  do  not  replace  the  relation  headings. 
Rather,  they  show  the  same  information  in  a different  form. 

Since  there  are  several  relational  application  models  equivalent  to  each 
graph  model,  we  could  then  extend  such  a system  to  let  different  users  see 


different  relational  versions  of  the  same  application.  A user  can  retrieve  from 
and  operate  on  the  database  state  in  terms  of  either  semantic  graphs  or  semantic 
relations. 

While  the  specification  of  retrieval  languages  has  not  been  essential  to  this 
thesis,  the  design  of  such  languages  would  be  necessary  for  the  proposed 
system.  [Deheneffe  and  Hennebert]  present  a retrieval  and  update  language  for 
a semantic  graph  data  model  similar  to  the  one  presented  in  this  thesis.  We  are 
not  aware  of  any  retrieval  language  appropriate  for  a semantic  relation  data 
model  as  presented  in  this  thesis.  Just  as  the  relational  algebra  is  a base  for  a 


I 


- 256  - 


retrieval  language  for  the  syntactic  relational  model  [Codd  72b],  the  operations 
used  in  constraint  definition  in  this  thesis  might  be  considered  an  appropriate 
starting  point  for  an  algebraic  language. 

[Chen]  provides  brief  examples  of  a retrieval  language  for  his  data  model 
similar  to  the  mapping  language  of  [Chamberlin  and  Boyce]  used  for  syntactic 
relational  systems.  The  sentence  templates  would  be  particularly  appropriate  for 
retrieval  in  a manner  similar  to  Query-by-Example  [Zloof].  For  updates,  a 
fill-in-the-blank  approach  using  the  sentence  templates  seems  natural.  Any  help 
provided  through  research  in  the  use  of  natural  language  as  a database  interface 
would  obviously  be  beneficial. 

Semantic  relations  are  assumed  to  always  contain  all  possible  statements  of 
the  form  specified  for  the  relation.  If  we  allow  retrievals  to  dynamically  create 
new  relations,  we  must  be  sure  to  properly  modify  the  relation  heading  and 
template.  For  example,  consider  the  relation  describing  employees  in 
Figure  11.1.  Now  assume  we  want  to  retrieve  the  statements  from  this  relation 
describing  those  employees  whose  age  is  less  than  40  years.  To  simply  present 
to  the  user  the  relation  in  Figure  11.2  would  be  misleading  unless  a clear 
distinction  was  made  between  database  relations  and  retrieved  relations.  The 
user  should  be  aware  of  the  fact  that  a particular  relation  contains  only  a 
subset  of  the  possible  statements  fitting  its  template. 

We  propose  investigating  the  possibility  of  reflecting  the  semantics  of  the 
retrieval  in  the  relation  heading  as  shown  in  Figure  11.3  for  this  example. 
Figure  11.4  shows  the  result  of  a projection  of  the  employee  names  in 
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"There  Is  an  employee  with  name  and  age 


be  employee  -.object 
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T.Manhart 
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Figure  11.1  - Semantic  Relation 


•There  Is  an  employee  with  name  and  age 
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Figure  11.2  - Result  of  Retrieval 
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Figure  11.3.  The  result  of  a projection  of  the  employee  ages  In  Figure  11.3 
results  in  Figure  11.5.  Note  the  sentence  templates  in  these  figures.  Exploring 
the  details  of  this  approach  is  an  area  for  future  research. 


Figure  1 1.3  - Retrieval  Result  Showing  Semantics  of  Retrieval 


We  have  presented  here  only  an  incomplete  description  of  the  dual 
semantic  data  model  interface.  The  directions  outlined  seem  natural  extensions 
of  the  research  in  this  thesis. 

The  ability  to  define  the  dual  data  model  also  suggests  several  other 
capabilities  which  can  be  provided  In  a database  system.  We  could  present  a 
strictly  relational  model  to  one  user  and  a strictly  graph  model  to  a different 
user.  The  database  system  can  maintain  one  of  the  components  of  the  dual 
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"There  Is  an  employee  aged  less  than  40  with  name 


Figure  11.4  - Projection  of  Figure  11.3 


"There  Is  an  employee  aged  less  than  40  with  age 


Figure  11.5  - Projection  of  Figure  11.3 


database  state  hidden  from  any  particular  user. 
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11.4  Inferential  Knowledge  Representation  Systems 

We  ask  here  whether  the  data  model  equivalence  framework  presented  in 
this  thesis  is  useful  for  the  study  of  "knowledge  representation  language** 
databases  used  with  inferential  artificial  intelligence  systems.  For  example,  one 
might  wonder  if  a particular  OWL  Linguistic  Memory  System  database 
[Szolovits  et.  al.]  is  equivalent  to  a particular  FRL  (Frame  Representation 
Language)  database  [Roberts  and  Goldstein].  The  answer  to  the  question  of 
usefulness  is  a qualified  yes. 

If  an  adequate  definition  of  database  state  equivalence  can  be  given  for  the 
data  models  of  concern,  then  the  rest  of  the  framework  can  be  used.  However, 
the  definition  of  database  state  equivalence  for  two  such  knowledge 
representation  languages  seems  to  have  several  obstacles. 

Firstly,  since  these  data  models  are  designed  to  be  used  with  inferential 
systems,  there  may  be  structures  in  the  database  state  which  are  oriented 
towards  a specific  inference  mechanism.  For  example,  in  OWL  Linguistic 
Memory  there  are  "multi-purpose"  links  whose  meaning  is  dependent  upon  the 
workings  of  the  OWL  interpreter  which  uses  the  database.  The  necessity  of 
considering  the  inference  mechanisms  for  determining  the  meaning  of  the 
database  is  not  encompassed  by  the  framework  used  in  this  thesis. 

Secondly,  a great  "richness"  of  representation  is  desired  for  such  systems. 
In  particular,  it  is  desired  to  represent  procedural  knowledge.  The  mechanism 
used  for  such  procedural  representation  may  make  it  Impossible  to  give  clear 


interpretations  of  such  procedures  at  a semantic  level.  This  is  likely  to  be  the 
case  in  FRL  which  allows  the  insertion  of  arbitrary  LISP  procedures  into  the 
database. 

Finally,  to  define  equivalence  between  data  models  there  must  be  at  some 
level  a common  view  of  the  world.  In  this  thesis,  both  the  semantic  graph  and 
semantic  relation  data  models  are  based  on  a simple  entity-association  view  of 
the  world.  It  is  an  open  question  whether  enough  commonality  exists  between 
more  complex  knowledge  representation  languages.  [Smith]  includes  an 
exploration  of  the  issues  involved  with  this  question. 

11.5  Other  Applications  and  Areas  for  Future  Research 

1.  A formal  framework  for  defining  data  models  has  been  presented.  Using  this 
framework,  several  different  types  of  data  model  equivalence  were  defined. 
This  framework  and  these  equivalence  definitions  should  be  used  to  evaluate  the 
suitability  of  different  data  models  for  the  different  ANSI  schema  levels. 
Insight  can  be  gained  into  both  the  data  models  themselves  and  into  the  type  of 
implementation  needed. 

The  considerations  involved  in  actually  Implementing  an  ANSI-like 
architecture  must  include  an  examination  of  the  equivalence  issues  discussed  in 
this  thesis.  Whether  the  defined  equivalences  are  state  dependent  or  not  would 
have  a large  effect  on  the  types  of  implementations  allowable.  The  operation 


equivalence  mappings  from  the  external  to  internal  levels  should  be  analyzed  as 
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a guide  for  implementation  efficiency  considerations.  The  same  types  of 
equivalence  mappings  must  be  involved  in  the  transportation  of  a database  and 
associated  programs  from  one  database  system  to  another. 

There  should  be  more  development  of  semantic  data  models  such  as  the 
semantic  relation  model  for  the  external  schema  level.  While  it  may  be 
necessary  to  support  existing  syntactic  data  models  at  the  external  level,  all 
users  should  not  be  deprived  of  important  semantic  information  which  helps 
them  better  understand  how  the  application  is  modelled. 


2.  The  complement  of  the  previous  suggestion  for  research  is  an  examination  of 
the  issues  involved  in  providing  a common  interface  to  differing  database 
systems  defined  in  terms  of  different  data  models.  This  problem  is  most 
applicable  in  attempting  to  interface  existing  independent  systems  into  a 
distributed  system  or  network.  A particularly  important  (and  perhaps  vexing) 
feature  of  this  problem  is  that  most  existing  database  systems  are  based  on 
syntactic  data  models.  Hence,  the  problem  of  defining  state  equivalence  is 
difficult. 

One  interesting  approach  to  this  problem  would  be  using  the  dual  semantic 
data  model  interface  previously  described  as  the  common  interface.  The  database 
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3.  We  saw  in  Chapter  8 that  there  are  relational  application  models  which  have 
no  equivalent  graph  application  models,  yet  which  seemed  to  specify  potentially 
desiralne  applications.  It  seems  worthwhile  to  explore  how  to  define  an 
extendible  semantic  graph  model  in  which  new  types  of  semantic  objects  and 
appropriate  operations  can  be  specified.  How  this  would  relate  to  the  derivation 
of  equivalent  constraints  would  need  to  be  explored.  The  concept  of  abstract 
data  types  [Liskov  et.  al.]  seems  to  be  a good  framework  for  such  an  extendible 
model.  [Weber  76]  and  [Weber  78]  show  work  relating  abstract  data  types  to 
database  problems. 

An  alternative  to  defining  new  operations  would  be  to  put  constraints, 
similar  to  those  of  the  relational  models,  as  additional  requirements  to  be 
satisfied  by  semantic  graph  operations.  [Chen]  proposes  this,  but  we  feel  that 
this  is  not  in  keeping  with  the  hature  of  the  semantic  graph  model  in  which 
allowable  types  and  operations  should  directly  model  the  application. 

I 

4.  We  have  avoided  implementation  questions,  but  they  must  be  answered  before 
any  data  model  is  actually  used.  The  semantic  graph  and  semantic  relation  data 
models  do  not  seem  to  introduce  any  particular  problems  different  from  those  of 
previously  proposed  data  models. 

However,  the  concept  of  basing  a database  system  on  a dual  data  model 
adds  another  level  of  complexity.  The  various  trade-offs  involved  should  be 
identified  and  explored.  Note  that  we  do  not  need  to  know  whether  the  dual 


data  model  is  "really"  implemented  in  terms  of  graphs  or  relations.  Such 
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concepts  art*  ab.sent  from  data  independent  data  models.  One  implementation 
methodology  which  claims  to  be  good  for  both  network  and  relational  syntactic 
data  bases  is  [Tsichritzis]. 

There  are  also  the  questions  of  concurrent  access  to  a shared  database  and 
distributed  database  systems.  We  would  want  to  guarantee  that  each  user 
always  sees  a database  system  acting  consistent  with  the  definition  of  the  data 
model  on  which  it  is  based.  [Hawryszkiewycz]  uses  a formal  model  of 
computation  to  prove  the  correctness  of  a theoretical  implementation  of  a 
syntactic  relational  database  system  without  constraints.  This  topic  also  needs  to 
be  explored  in  terms  of  semantic  data  models  as  presented  in  this  thesis.  The 
interaction  of  the  semantics  of  the  database  state  and  the  choice  of  "locking" 
units  for  concurrent  users  should  be  explored.  The  significance  of  database 
Semantics  to  the  implementation  of  distributed  database  systems  is  completely 
unexplored. 


5.  We  can  define  the  equivalence  of  data  models  only  after  we  understand  the 
semantics  of  the  applications  representable  in  the  data  model.  For  the  same 
reason,  efforts  aimed  at  finding  general  methods  of  transferring  data  from  one 
database  system  to  another  should  be  based  on  semantic  data  models.  Work  in 
the  area  of  data  translation  (such  as  [Sibley  and  Taylor])  has  been  based  on 
syntactic  data  models  and  has  largely  ignored  the  problems  of  transferring 
programs  from  one  database  system  to  another  which  requires  the  derivation  of 
equivalent  operations  [Dennis], 
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This  thesis  has  shown  how  equivalent  database  states  and  operations  can  be 
found  for  the  two  data  models  presented.  The  impact  of  equivalent  semantic 
data  models  on  the  problem  of  data  translation  should  be  explored. 

6.  The  examples  used  in  this  thesis  have  been  relatively  simple.  In  particular, 
note  that  only  two  cases  (roles)  have  been  used  in  expressing  associations:  agent 
and  object.  The  expression  of  more  complex  associations  might  involve  other 
cases  [Bruce]  such  as  result,  instrument,  source  and  goal.  Attempts  should  be 
made  to  express  other  applications,  preferably  "real"  ones,  in  terms  of  the 
semantic  graph  and  semantic  relation  data  models. 

7.  The  entity-association  model  used  as  a semantic  basis  in  this  thesis  is 
relatively  simple.  We  will  mention  two  specific  areas  in  which  extensions 
might  be  desirable. 

Firstly,  this  model  includes  no  specific  concept  of  time.  Time  could  be 
expressed  as  entity  or  association  characteristics,  but  a more  sophisticated 
approach  might  be  useful.  [Bradly]  and  [Bubenko]  present  data  models 
incorporating  time.  There  has  also  been  pertinent  work  in  the  artificial 
intelligence  area  such  as  [Kahn]  and  [Thompson  and  Thompson], 

Secondly,  the  models  used  in  this  thesis  assume  that  entity  sets  are 
disjoint.  Data  models  which  do  not  make  this  assumption  are  described  in 
[Palmer]  and  [Smith  and  Smith].  Again,  researchers  in  artificial  Intelligence 
knowledge  representation  such  as  [Szolovits  et.  al.]  and  [Fahlman]  have  explored 
this  topic. 
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1 1 .6  Conclusion 

There  has  been  a twofold  theme  throughout  this  thesis:  the  importance  of 
representing  application  semantics  in  a database  system's  data  model  and  the 
necessity  of  being  able  to  derive  equivalent  operations  in  two  data  models 
which  purport  to  model  the  same  class  of  applications.  The  themes  are 
intertwined  because  only  when  it  is  agreed  that  the  application  semantics 
represented  in  two  different  data  models  are  comparable  can  there  be  any 
meaningful  definitions  of  equivalence. 

We  have  presented  in  this  thesis  a formal  framework  for  defining  data 
models  and  data  model  equivalence.  Two  semantic  data  models  have  been 
presented  and  their  various  equivalences  within  the  framework  were  shown. 
These  included  the  concepts  of  state  equivalence,  operation  equivalence, 
application  model  equivalence  and  data  model  equivalence.  These  various  types 
of  equivalences  are  not  mere  artifacts  of  formality,  but  rather  are  Important 
properties  which  must  be  investigated  whenever  asserting  that  two  data  models 
are  equivalent.  This  thesis  was  intended  to  convey  the  idea  that  a formal 
approach  is  necessary  to  truly  understand  the  problems  involved. 

The  two  data  models  have  merit  in  that  they  show  how  application 
semantics  can  be  embedded  in  equivalent  data  models  which  are  related  in  form 
to  two  widely  discussed  types  of  (syntactic)  data  models:  network  and  relational. 
These  data  models  are  appropriate  for  database  systems  which  propose  to  allow 


different  users  to  have  views  of  the  database  most  appropriate  for  their 
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application.  Moreover,  it  is  shown  how  equivalent  application  models  can  be 
defined  and  how  equivalent  operations  can  be  found  for  these  equivalent 
application  models. 

It  is  not  claimed  that  the  data  models  described  In  this  thesis  are  the  "be 
all  and  end  all"  of  data  models.  But  it  is  hoped  that  this  thesis  will  make  the 
database  system  designer  consider  the  issues  of  application  semantics,  data  model 
equivalence,  and  their  interrelations. 
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Appendix  1.  The  Semantic  Network  Data  Model: 
Formal  Specifications 


| 


G-schema  « (E, 

0, 

r: t (A  x R)  -♦  E], 
c: I ( (EuA)  x C)  - D), 

fr: t ( A x R)  - (restrict: (restricted, unrestricted!, 

mand:  (mandatory, non-mandatory! ) ] , 
fc:(((EUA)  x C)  -*  (key,non-keyl 1, 
sd:(0  -»  (Str  -»  Boolean]] 

R. 

C. 

Str  ) 


G-state  **  (f:Pouerset  (E- id), 

A:Pouerset  (A- id), 

D:Pouerset  (Domain-element), 
r:I (A- id  x R)  -♦  E-id], 
c:((ld  x C)  -*  Domain-element]  ) 


Domain-element 
_ Char-funct ion 
E-id 

Role-funct ion 
A-  i d 
Id 

Boo  I ean 

Arg- I i st 

E-desc 

A-desc 

E 

A 

D 

R 

C 

Str 


- (typedD,  value:Str) 

- (C  -♦  Domain-element] 

- (type:E,  chars:Char-funct  ion) 

- (R  - E-id] 

- (type:A,  roles:Ro  I e- function,  chars:Char-funct  ion) 

- A-id  U E-id 

- (true,  faisel 

- (E-desc  u A-desc)*  such  that  each  element  of  the  list  is  unique 

- (E-id:E-id,  chars:Char-function) 

- (A-id:A-id,  rolesrflole-funct ion,  chars:Char- function) 

- entity  set  names 

- association  set  names 

- domain  names 
« roles 

- characteristics 

- character  strings 


Figure  AI.l  - Type  Specifications 


. 
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first-role:  l (A  x G-schema)  -*  R) 
next-role:  l (A  x G-schema  x R)  -•  R] 
last-role:  l (A  x G-schema)  -*  R) 

These  three  functions  specify  the  ordering  on  the  role  set  edges 
for  a specific  association  set  node  in  the  database  state. 

first-assoc-role:  (G-schema  -*  (A  x R) ) 
next-assoc-role:  t (G-schema  x A x R)  -*  (A  x R)1 
last-assoc-role:  (G-schema  -•  (A  x R ) ) 

These  three  functions  specify  the  ordering  of  role  set  edges  In 
the  database  schema. 

first-aid-role: (G-state  -•  (A-id  x R)J 

next-aid  role: ( IG-state  x A-id  x R)  -»  (A-id  x R)  1 

last  aid  role:(G-state  -*  (A-id  x R ) ) 

These  three  functions  specify  the  ordering  of  role  edges  In  the 
database  state. 


For  the  three  groups  of  functions  above,  we  assume  there  is  a "dummy” 
last  element  in  each  ordering  which  does  not  represent  an  actual  edge. 


image:  (lx  -•  >•)  -•  Pouerset  (y) ) 
domdlx  -*  >0  -•  Pouerset(x)] 

These  functions  return  the  domain  and  Image  of  a function. 


X : t ( x x ( y -*  z)  ) -♦  ( I x x y)  ■*  z) 

This  function  combines  a constant  into  the  domain  of  a function. 

| | s U (x  -♦  yl  x Powerset  (x) ) -»  lx  -•  y]  ] 

This  function  returns  a function  which  Is  the  same  as  its  first 

argument  restricted  to  the  domain  specified  by  the  second  argument. 

head:x*  -♦  x 
tailix*  -•  x* 

These  functions  return  the  first  element  and  the  remainder  of  a list 
respectively. 


Figure  A1.Z  - Built-in  functions 


- 277  - 


valid-entity (schema:G-schema,  state:G-state,  e-tdf-id,  e-chars:Char-funct  ion) 
returns  Boolean 

7.  chock  validity  of  ontity  description  7. 


return  -ie-id  c state.E)  7.  not  alroady  ontity  with  same  id  7. 

A e-id. type  c schema.E  7.  valid  type  7. 

A e-id.chars  - e-chars  ||  dom  (e-id.chars) 

7.  id  matches  other  statement  of  characteristics  in  description  7. 

A dom  (e-chars)  » lc:C  | (e-id.type,c)  c dom  (schema,  c)  I 

7 correct  characteristics  defined  7. 


a dom  (e-id.chars)  * lc:C  I (e-id.  type, c)  < dom  (schema. c)  7.  proper  id  7. 

a schema,  fc(e-td.  type,  c)  - keyl 


a -Gc:C)(c  c dom(e-chars)  7.  characteristic  values  in  correct  domain  t 

a -de-chars  (c)  .type  - schema. c (e-id. type,  c) 

a schema.  sd(e-chars  (c) . type)  (e-chars  (c)  .value)  - true]  ) 
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valid-association  (sc!iema:G-schema,  stote:G-state,  o-ld:A-id,  a-rol es:Ro I e- funct  i on, 

a-cflars:Char-funct  ion) 

returns  Boolean 

7.  check  validity  of  association  description  7. 

return  ->(o-id  c stote.4)  7.  not  already  association  with  came  id  t 

A a-id.type  ( schema.A  7.  valid  type  t 

A a-id.roles  • a-roles  ||  dom la-id. roles) 

7.  id  roles  match  those  in  description  t 

A a-id.chars  » a-chars  ||  dom(o-id.chars) 

7.  characteristics  match  those  in  description  t 

A dom(o-cftors)  - lc:C  | (a-id.type, c)  c dom  (schema,  c)  I 

7.  correct  characteristics  defined  t 

a domlo-id. chars)  - (c:C  | (o-td.  type, c)  c dom(schemo.c) 

a schema,  fc  (a-id.  type,  c)  - key) 

7.  correct  characteristics  in  id  t 

a dom  (a-id.  roles)  - lr:R  | la-id.  type,  r)  c dom  (schema,  r) 

a schema,  frla-id.  type,  r)  .restrict  - restricted! 

7.  correct  roles  in  id  t 

a — 1 3c:  C ) (c  r dom  (a-chars)  7.  characteristic  values  in  correct  domain  X 

a -lo-c/lcrs  (c) . type  - schema. c (a-id.type,  c) 

a schema,  sdla-chars  (c) . type)  (a-chars  (c)  • value)  - true) ) 

a -Gr:R)(r  c dom(a-rotes)  7.  characteristics  of  roles  in  correct  domains  % 

a -fo-ro/es(r)  .type  = schema,  r (a-id.  type,  r) 
a dom(o-role.s  (r)  .chars)  « (c:C  | (a-roles(r)  ,type,c)  c dom  (schema,  c) 

a schema,  fc  (a-roles  (r)  .type,  c)  - key) 
a — C3c: C ) (c  < domlo-rotes(r).chars) 

a -'la-roles  (r)  .chars(c)  .type  * schema,  c (a-id.  type,  c) 
a schema,  sd  (a- roles  ( r) . chars  (c) . type)  (a-roles(r) . chars  (c) . value)  - true) ) ] ) 

Figure  A1.4  - valid-association  Predicate 
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Insert-md-entity  (scAema:G-schemat  stote:G-state,  e-desc:E-desc) 
returns  G-state 

7.  incart  independent  entity  7. 


let  e-td  - e-desc.E-id 

e-chars  - e-desc.chars 

in  if  valid-entity  ( schema , state,  e-id,  e-chars ) 

A -13r:R)  (3o:A)  (sc/iemo.r  (o,r)  - e-td.type 

A schema.fr  ( o,r). mand 

then  return  (stote.E  u le-tdl, 
state. A, 

state.D  U image  (e-cJiors), 
state.r, 

state.c  u X (e-id, e-chars)  ) 

else  return  error 


7.  proper  entity  description  7. 
7.  check  that  independent  7. 
mandatory) 

7.  insert  entity  id  7. 
7.  insert  new  domain  nodes  7 
t insert  characteristic  edges  7 


Figure  A1.5  - Insert-ind-entity  Operation  Type 
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Insert-ind-association  (scliematG-schema,  s totetG-state,  a-desc*- desc) 
returns  G-state 

X insert  independent  association  7. 

let  a-id  - a-desc. A-id 

a- roles  - a-desc.r oles 
a-chars  = o-desc.chars 

in  if  valid-association  ( schema , state,  a-id,  a-roles,  a-chars) 

t proper  association  description  X 

A image(o-id.roles)  c state.E  7.  all  entities  in  association  already  in  state  X 

then  return  Istate.E, 

state.A  u Iff-icfl, 
state.O  U image ia-chars), 
state.r  u X la-id, a-roles), 
state. c u X la-id, a-chars)  I 

else  return  error 

Figure  >17.6  - Insert-ind -association  Operation  Type 


7.  insert  association  id  X 

7.  insert  new  domain  nodes  X 

7.  insert  role  edges  X 

7 insert  characteristic  edges  X 


- 281 


Inset  r-semantic-umt  (acAemoKJ-schema,  atote:G-state,  arg-list:hrg-  list) 
returns  G-state 

7.  this  initiates  mutually  recursive  calls  of  insert-mand-entity,  7. 

7.  insert-mand-association  and  check-roles  in  ordar  to  delate  a semantic  unit  7. 

let  (state' , arg-list')  - insert-mand-entity  (schema, state, arg-list) 

7.  insert  first  entity  in  argument  list  (which  must  be  mandatory)  t 

in  if  arg-list’  - ni  I 7.  all  arguments  properly  inserted  7. 

then  return  state' 

else  return  error 


Figure  A1.7  - Insert-semantic-unit  Operation  Type  (main  procedure) 
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insert  mind  entity  (jc/iema;Gschema,  state:G-state,  arg-l  ist:Arg-  I i st ) 
returns  (G  state, Arg- 1 ist) 

7.  insert  dependant  entity  7. 

let  e-id  » head  (arg-L  1 3 1 ) .E-id  1.  get  description  from  argument  list  % 

e-chars  = head (arg-l is t).  chars 

in  if  valid  entity  (schema,  state,  e-id,  e-chars)  t valid  entity  description  X 

A (3r:R)  (3</:A)  (schema. r (a, r)  - e-id. type  t not  independent  * 

A schemaJr  (a,r) .m»nd  - mandatory) 

then  let  state'  - ( state. E U t e-id),  t insert  entity  id  X 

state. A, 

state.D  U image  (e-chars),  7.  insert  new  domain  nodes  * 
state.r, 

state.c  U X (e-id, e-chars)  ) 7.  insert  characteristic  edges  X 

in  return  insert-mand-association(5c/iema,3tate',tail(arp-tt5t), 

e-i d,f irst-a ssoc-role  ( s chema ) ) 

7.  next  element  on  list  should  be  mandatory  association  lor  this  entity,  X 
7.  to  find  which  which  role  and  association  this  should  be,  start  checking  X 

7.  with  "first”  association,  role  pair  in  schema  t 

else  return  error 


Figure  A1.8  - insert-mand-entity  Function 
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insert  mand-a ssociation 
returns  G-state 


(sc/iema:G-schema,  state:G-state,  arg-list'Arg-  list, 

e-id£-td,  (o:A,roi e£0 


) 


7.  for  a given  entity,  finds  and  insarts  all  rolas  in  associations  which  ara  7. 

7.  mandatory  for  this  antity  7. 

if  (a, role)  = last-assoc-role  (scAema) 

7.  if  last,  than  can  ba  no  mora  mandatory  rolas  for  this  entity  X 
then  return  (state, arg-list) 

else  if  schemaAr  (a,r).mand  = mandatory  % check  if  rola  is  mandatory  X 

A -.(3a-id:A-idHstate.Ha-id,r)  - e-id 
A o-id.type  - a) 

then  let  a-id  - head  (orp-f  istl.A-id 

a-roles  « headtarg-iisthroles 
a-chars  - head (arg-list). chars 

7.  it  mandatory,  association  should  ba  next  on  list  t 

in  if  valid -association  (schema,  state,  a-id,  a-roles,  a-chars) 

7.  valid  description  t 

A a- id. type  - a 7.  proper  type  X 

A o-roles(role)  *=  e-id  7.  entity  is  in  this  association  X 


then  let  state'  - (state.E, 

state. A u la-idl,  7.  insert  association  7 

state.D  U image  (a-chars),  7.  domain  nodes  X 
state.r  u X (a-id, a-roles)  7.  role  edges  X 

state.c  u X (a-id, a-chars)  I 

7.  insert  characteristic  edges  X 

inlet  (state" ',arg-list')  « 

check-roles  ( schema, s tate'  ,a-id,  tail  (arg- list), 
first-role  (a  .schema) ) 

7.  check  each  rola  in  association  to  sea  if  entity  has  been  inserted  X 

in  return  insert-mand-association(scAema,state' ' ,arg-list' , 
e-td,next-assoc-role(sc/iema,c,roie)  ) 

7.  check  next  association,  role  pair  to  sea  if  mandatory  lor  this  entity  X 

else  return  error 


else  return  insert-mand-associationlscAema, state, arp-fist, 

e-td,next-assoc-role(scAema,a,rote) ) 

7.  role  was  not  mandatory,  check  next  association,  role  pair  X 


Figure  A1.9  - insert-mand-association  Function 


r 
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check  roles  (schemo:G- schema,  state:G-state,  a-id:A-id,arg-list:Arg-l  ist,  roleJR) 
returns  (G-state,Arg-l  ist) 

7.  lor  a givan  association,  chocks  aach  rola  to  saa  if  antity  (or  rola  t 
7.  still  naads  to  ba  insartad  (a.g.  thara  ara  othar  dapandant  antitias  t 
7.  in  this  association).  7. 

if  role  - last-role ( a- idAype, schema)  t no  mora  rolaa  t 

then  return  lstate,arg-list) 
else  let  e-id  - state.r la-id, role) 

in  if  e-id  < state.E  t antity  alraady  insartad  7 t 

then  return  checY.-ro\eslschema,state,a-id,arg-list, 
next-role  (a-ftf.type, schema, role) ) 

7.  yas,  chack  naxt  rola  t 

else  let  lstate',arg-list')  « insert-mand-entity  (schema, state, arg-list) 

7.  no,  insort  tha  antity  t 

in  return  check-roles(schema,state',a-td,ar0-f  ist’, 
next-role ( a- td.type, schema, role) ) 

7.  chack  naxt  rola  t 


Figure  A1.10  - check -roles  Function 
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Delete  ind  association  ( 3 cAeroa:G- schema,  3tate:G-state,  a-id'A-\d) 
returns  G-state 

7.  delete  independent  association  7. 

if  a-id  ( state. A £ exists  in  state  t 

A -Or:R)  ( (a-id,r)  c dom  (3  tote.r)  t independent  t 

A scftema.lr  (a-id.type,r)  .mand  • mandatory 
A -Go-td':A- id)  (states ta-id',r)  « states  (a-id, r ) 

A o-td'.type  « a-id. type 
A a-td’  0 a-id) ) 


7 delete  id  t 

ld:D|  ( 3c:C ) (state.c(a-id,c) -d 

A -Gx:ld)  (x  *>  a-id 

A state.e(x,c)  - d)  ) I, 

7.  delete  domain  nodes  not  used  elsewhere  X 

state.r  ||  I ( a-id ’:A- id, r:R)  | (a-id',r)  t dom  Utate.r) 

A a-id  * a-id'\t 
7.  delete  role  edges  1 

state.c  ||  l(x:ld,c:C)  | (x,c)  c dom(state.c) 

Ax*  «t-idl  ) 

7.  delete  characteristic  edges  * 


then  return  ( state! , 

state. A - 
state.D  - 


else  return  error 


Figure  A1.12  - Delete-ind-association  Operation  Type 
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Delete-semantic-unit  ( schema:G- schema,  state:G-state,  a-id: A-id) 
returns  G-state 

7.  delete  semantic  unit  by  mutually  recursive  calls  of  Delete-semantic-unit,  7. 

1.  delete-rotes  and  delete-mand-entity  7. 

if  a-id  t state. A 7.  association  in  state  X 


then  if  -(3r:R)  ( fa-id, r)  c dom(stote.r) 

A schema.fr  la-icf.type,r).mand  - mandatory 
A -(3a-td*:A-id)  (state.r(a-id',r)  - state.r fa-id, r) 

A a-td'.type  - a-td. type 
A a-id'  m a-id)) 

7.  error  if  association  not  in  semantic  unit  t 


then  return  error 


else  let  state'  - fstate.E, 

state.A  - la-idl,  7.  delete  association  id  t 

state.O, 

state.r, 

state.c) 

in  let  state'"  * delete-roles  (schema, state', a-id, 

first-role  ta-id.type, schema) ) 

7.  start  checking  roles  to  delete  dependent  entities  in  association  t 

in  return  (state*  '.E, 

state*  '.A, 

state". D - ldd]|  (3c.*C)  (state*  *.c(a-td,c) -d 
A -'(3x:Id)  (x  ■*  a-id 

A state*  *.c(x,c)  - d)  ) I, 

7.  delete  domain  nodes  not  used  elsewhere  t 

state*  *.r, 

state". c ||  l(x:Id,c:C)  | (x,c)  c dom  (state*  *.c) 

A x 0 a-idl  ) 

7.  delete  characteristic  edges  t 


else  return  error 


Figure  A1.13  - Delete-semantic-unit  Operation  Type  (main  procedure) 
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delete  roles  (sd)cma:G-schema,  state;G-state,  a-id:A-id,  ro  l e:ro I e) 
returns  G-state 

t delsle  each  dependent  role  in  association  l 

if  role  - last-role (a-id.lype, schema)  Z no  moro  roloa  * 

then  return  state  z dono  Z 

else  let  e-id  - state.r la-id, role) 
state'  - (state.E, 

state.A, 

state.O, 

state.r  ||  I (a~td':A-id,r:R)  | (a-id’,r)  c dorr*  (state.r) 

A la-id', r)  0 la-id, role)  I, 
t doloto  rolo  td|t  Z 


state.c) 


in  if  e-id  < state'.E  1 entity  still  in  state  Z 

A schema.fr  (a-id.lype, role)  .mand  - mandatory 
A -(3a-trf’:A- id)  (state.r la-id', role)  - e-id 
A a-id’.type  - a-id.type 

A a-id'  0 a-id)  Z dependent  Z 

then  let  state”  « (state's  - le-tdl,  Z delete  entity  id  X 
state'. A, 

state’ .0  - ld:0|  (3c:C)  (state'. cle-id,c)  -d 
A -(3x:ld)  (x  0 e-id 
a state’.c(xfc)  - d))l, 

Z delete  domain  nodes  not  used  elsewhere  Z 

state',  r, 

state'.c  ||  <(x:Id,c:C)  | (x,c)  < dom(state'.c) 
Ax*  e-tdl  ) 

1.  delete  characteristic  edges  Z 

in  return  delete-roles  Ischema,  Z check  next  role  Z 

delete-mand  -entity  (schema, 
state' ', 
e-id, 

first-aid-rolefstate’  * ) ) 

Z delete  dependant  entity  Z 

a-id, 

next-role  I a-id.(ype, schema  ,role)) 

else  return  delete-roles  (schema, state’, a-td,  Z check  next  role  Z 

next-role  (a-id.lype, schema, role) ) 


Figure  A1.14  - delete-roles  Function 
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delete-mand-entity  ( 3 chema:G- schema,  s tate:G-state,  e-idf-id,  (a-id:A-id,role:R) ) 
returns  G-state 

7.  checks  to  see  if  entity  is  mandatory  in  other  semantic  units  7. 

if  (a-fd,ro/e)  = last-aid-role  (state)  7.  last  role  7. 

then  return  state  7.  done  7. 

else  if  state.r  (a-id, role)  = e-id  7.  entity  in  association  X 

A a-id  < state.A  7.  association  in  state  7. 

then  if  schema.fr(a-id.type,rote).mand  ■ mandatory  7.  dependent  X 

A -Ga-id':A-id)  (state.r(a-i<f ’,ro!e)  - state.r  (a-id, role) 

A a-id'Aype  = a-td.type 
A a-id'  * a-id ) 


then  return  delete-mand-entity  (sc/iema, 

Delete-semantic-unit  ( s chema,s  tate,a-id), 

7 delete  the  semantic  unit  containing  this  entity  and  association  X 
e-id, 

next-aid-role  (state, a-id, role) ) 

7.  check  next  role  X 


else  return  error 

else  return  delete-mand-entity (sc Aetna,  7.  check  next  role  X 

state, 

e-id, 

next-aid-role(state,a-f  d,rofe) ) 

else  return  error 


Figure  A1.15  - delete-mand-entity  Function 
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Appendix  2.  The  Semantic  Relation  Data  Model: 
Formal  Operation  Specifications 


R-schema  - (Rel:  [R-names  -♦  Heading], 

Con:  Powerset  (Constraints), 
R-names, 

P. 

S, 

E, 

D, 

C, 

sd:  ID  -*  [Str  -*  Boolean]], 

Str) 

R-state  - [R-names -♦  Relation) 


Heading 

H 

Relation 

Tuple 

Char  funct i on 

Domain-element 

Case-col 

Co  I - 1 i s t 

Constraint 

Boolean 

R-names 

P 

S 

E 

D 

C 

Str 


- CH  - CC  - OJ 1 

■ (Powerset*  (P  x S)  x El  U P 

• Powerset (Tuple) 

- [H  -*  Char  funct  ion] 

• (C -*  Domain-element) 

» (typed],  value:Str) 

- (predicate-casezPowerset’  (P  x S),  typef) 

■ Case-col* 

- I ( [R-names -*  Heading) , R-State)  -»  Boolean] 

■ (true,  false) 

- relation  names 

- predicates 

• semantic  case  names 

- entity  set  names 

- domain  names 

- characteristics 

• character  strings 


Figure  AZ.l  - Type  Specifications 


( 
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image:  11*  ■*  y]  ■*  Power  sett/)} 
dom:[[x  -•  y]  -*  Powerset(x)] 

These  functions  return  the  domain  and  image  of  a function. 


X:l  (x  x ly  -*  z))  ■*  [ ( x x y ) -»  z] 

This  function  combines  a constant  into  the  domain  of  a function. 

| |:  t ( lx  -*  y]  x Power  set  (x) ) -*  (x  -♦  y]  ] 

This  function  returns  a function  which  is  the  same  as  its  first 

argument  restricted  to  the  domain  specified  by  the  second  argument. 

head:  x*  -»  x 
tail:  x*  - x* 

These  functions  return  the  first  element  and  the  remainder  of  a list 
respectively. 

<x  | P(*)> 

This  means  "the  unique  x such  that  P(xl".  Evaluates  to  error  if  there 
is  no  such  x or  there  is  more  than  one  such  x . 

<xj,x 2.  . . . xn) ±i  - x-  (selection  function) 

Figure  A2.2  - Built-in  functions  and  notation 
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Definition  A2.1:  Ordering  of  Domain  Elements 

For  domain  element  types  d,d'  c D and  strings  x,y  c Str, 

(d.  x)  > (d, y)  if  and  only  if 

1.  d - d* 

and  2.  i).  x = y 

or  ii).  x-"—"  and  y-"— \ 

Definition  A2.2;  Ordering  of  Column-group  Headings  ( Elements  of  H) 

For  h,h’  c H,  p.p’  c P,  pc-set, pc-set’  c Power se t*  (P  x S)  and  e c E, 
h>h’  if  and  only  if 

1.  h - h’ 

or  2.  i.  h « (pc-set,  e)  and  h*  ■ (pc-set’,  e) 

and  ii.  pc-setDpc-set’ 

Definition  A2.3:  Ordering  of  Tuples 
For  t, t’  c Tuple, 

1.  If  dom(t)  = domlt’), 

then  t > t ’ if  and  only  if 

(Vh:H)  (htdom( t)  z>  dom(t  (h) ) =dom(t’  (h) ) 
a (Vc:C)  (ccdom(t  (hll  o t(h)  lc)>t'  (h)  (cH) 

This  is  for  the  case  of  tuples  with  identical  headings. 

2.  If  dom(t)  x dom(t’), 

then  t>t’  if  and  only  if 

there  is  a 1-1  onto  correspondence  f:dom(t)  -*  dom(t’)  such  that 

(Vh:H)  (hcdom(t)  o dom(t  (h)  )-dom(t’  (f  (h)  1 ) 
a (Vc:C) (ccdom  ( t (h) ) o t (h) (c) >t’ (f (hi) (c) ) ) 

This  is  for  the  case  of  non-identical  headings. 


Figure  42.3  - Ordering  Definitions 
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valid  rels  ( s chemaiR- schema,  rels:R-state) 
returns  Boolean 

X checks  that  relations  are  of  proper  form  X 

return  dom(rets)  «=  domlscAema.Rel) 

A [ (Vr-aJl-names)  (r-n  e dom(rels) 

o l(Vt:Tuples)  (t  ( relslr-n ) X for  each  tuple  In  each  relation  X 

d (dom(t)  » dom  (scAema.Rel  ( r-n) ) X correct  column-groups  X 

A (3f»dH)  (3c:C)  (A  c dom(t)  X at  least  one  non-null  value  X 

acc  dom  ( t (A) ) A t(A)  (c). value  * " ") 

a f VAdd)  (A  c dom(t) 

d (domlt(A))  = dom IscAemc.Rel  (r-n)  ( A) ) 

X correct  characteristics  X 

A (Vc:C)  (c  c dom(t(A) ) 

d tt(A)  (c).type  - scAema.Rel(r-n)  (A)  (c) 

A scAemo.sd  ( JcAemo.Rel  ( r-n)  (A)  (cl)  (t(A)  (c).value)  - true!  HID] 

X values  In  proper  domains  X 

A -'(3t:  Tuples)  (3t ' : Tuples)  (tc  rets  (r-n)  a t’t  rels(r-n)  At>t'))l 

X all  tuples  are  Incomparable  under  partial  order  X 


Figure  A2.4  - valid-rels  Predicate 
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Insert-tuples  (scAema:R-schema,  state:R-state,  addi tioJcR-etate) 
returns  R-state 

X Insert  statements  Into  relations  X 


if  valid-re1s(schema,addition)  X addition  In  proper  fore  X 

A valid-rels (schema, s tote)  X state  In  proper  form  X 

A (Vr-icR-names)  (r-n  c dom (addition) 

d -GtiTuple)  (3t':Tuple)  (t  c additionlr-n)  a t'  c state(r-n) 

A t * > t)  ) 

X not  attempting  to  Insert  statements  already  there  X 
X or  statements  less  than  those  already  In  state  X 

then  let  state'  - Xr.  (s-umon(s tote (r) , addition ir) ) ) X do  Insertion  X 

in  if  (VconrConstraint)  (con  < schema.Con  X check  constraints  X 

3 con(schema.Rel, state')  - true) 

then  return  state'  X ok  X 

else  return  error  X constraints  failed  X 


else  return  error 


Figure  A2. 6 - Insert-tuples  Operation  Type 


s-union  ( r-l:Re  I at  i on, r-2:Re  I at  i on) 
returns  Relation 

X performs  union  of  relations,  deleting  statements  which  X 
X are  less  than  some  other  statement  X 

return  lt:Tuple  | t < r-lu  r-2 

a -'(3t,:Tuple)  (t*  c r-1  U r-2  a t * > t) I 

Figure  A2.6  - s-union  Function 
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Delete-tuplesIsc/iemazR-schena,  3 tcteft-state,  delettofttf-state) 
returns  R-state 

X delete  tuples  from  relations  X 

if  valid-relslscftemr, state)  X state  In  proper  fora  X 

A (Vr-it:R-names)  (r-n  < dom (addition) 

3 (Vt:Tuple)(t  t deletion(r-n) 

o Gt’iTuplel (t*  ( state! r-n)  a t'  > till 
X each  tuple  In  deletion  Is  less  than  or  equal  to  some  tuple  In  state  X 

then  let  state'  - Ar.  (s-difference(state(r) , deletion (r) ) ) X perform  deletion  X 

in  if  (Vco»:Constraint)  (con  c schema.Con  X check  constraints  X 

d con  (schema.Rel, state')  - true) 

then  return  state’  X ok  X 

else  return  error  X constraints  failed  X 

else  return  error 


Figure  AZ.7  - Delete-tuples  Operation  Type 
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s-d  i f ference  ( r- 1 :Re  I a t i on,  r-2JRe  I a t i on ) 
returns  Relation 

X delete  statements  from  relations  X 

X make  all  statements  In  r-l  less  than  statements  In  r-2  X 

X by  changing  values  to  null  or  deleting  statement  If  necessary  X 

return  (r-/ - »t:Tuple  | (3t's TupleXt  c r-l  a t'  i r-2  a t>t')l) 

X remove  tuples  greater  and  equal  to  those  In  deletion  X 

U (t’':Tiaple|  (3t: Tuple) (3t’s Tuple) (t  c r-l  a t*  c r-2  a t > t* 

X statement  In  r-l  is  greater  than  statement  In  r-2  X 
X add  back  lesser  statement  (If  one  exists)  to  r-l  X 
a (VA:H) (VA'rH) (Vc:C) 

(Acdom(t)  a ccdom(t(A))  a A'cdom(t’)  a ccdom(t'  (A*)  1 

s (((ft  -ft'  a t(fc) (c)-t' (ft)  (cl) 
d t"  (ft)  (c)  =H — ") 

X the  case  of  equal  headings  and  values  - change  value  to  null  X 

a ( (ft>H*  v (ft-ft’  a t(ft) (c)>t '(ft) (c)  )) 
d f(ft)(c)-t(ft)(c)))) 

X otherwise  leave  value  alone  X 

a dom(t' ')  - dom(t)  X correct  form  for  resulting  tuple  X 
a (Vft:H)  (ftcdom(t)  DdomU'Mft))  -dom(t(A))) 

a (3ft:H)  (3c:C)  (ft  ( dom(t")  X at  least  one  non-null  value  X 
acc  dom(t*  * (ft) ) a t'  * (ft)  (c)  .value  * " — **)  ) ) 


Figure  A2.8 


s-difference  Function 


Appendix  3.  The  Semantio  Relation  Data  Model: 
Formal  Constraint  Specifications 


Project  (new-deadheading,  (deadheading,  rhe  I at  ion) ) 
returns  (Heading, Relation) 

X eliminates  columns  and/or  predtcate:case  pairs  from  heading  X 

if  dom'aew-Aead)  * 0 X cannot  project  to  empty  heading  X 

A (VAh)  (A  c dom(  new- dead)  X check  form  of  new  heading  X 

s (3A’h)  (A'cdom(Aeed)  a A<A* 
a docn(A)  c dom(A’) 
a (Vc:C)  (c  c dom(A) 
o A(c)  - A’ (c) ) ) ) 

then  let  r’  - (t:Tuples  | dom(t)  - dom(  new- dead)  X tuples  of  proper  fora  X 
a (VA:H>  (A  < dom(t) 

:>dom(t(A))  » dom(neif-Aead(A) ) 

a (3A:H)  (3c:C)  (A<dom(t)  X exists  at  least  one  non-null  value  X 
a ccdom(  t(A) ) a t (A)  (c>-“ — ") 

X values  in  projected  tuple  equal  to  those  In  tuple  of  r X 
a (3t Muples)  (t * c r 
a (VA:H)  (Vc:C) 

((A  c dom(t)  acc  dom(t(A))) 

o (t(AHc)  - t*  (<A’h|  A'cdom(Aead)  a A<A'>)  (c) ) ) ) ) 
in  return  (nea-Aead/’) 
else  return  error 


Figure  A3  A - Project  Constraint  Operation  Type 
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Equal  ( ( headi ng- J dlead  i ng,r- l&e  I a t i on) , ( heading-2* lead i ng,r-£Re  I a t i on) ) 
returns  Boolean 

X checks  If  two  relations  are  Identical  X 
if  heading- 1- headi ng-2  a r-1  - r-2 
then  return  true 


else  return  false 


Figure  A3.2  - Equal  Constraint  Type 


Col-func  ( headi  ng-lMea d i ng, headi  ng-2: Head  i ng,  1 headi  ngiHead  i ng, re l fle  I a t i on) ) 
returns  Boolean 

X Checks  that  no  two  tuples  In  rel  are  equal  In  columns  specified  X 
X by  heading- 1,  but  unequal  In  columns  specified  by  heading-2.  X 
X It  Is  not  required  that  tuples  with  null  values  In  the  columns  of  heading-i  X 
X need  have  equal  values  In  heading-2  X 

if  -Gt)  Gt')  iUrel  a t’crel 

a (V/idH)  (Vc:C)  lhidomlheading-1)  a C(domlheading-Hh) ) 
d t(A) (c)-t’ (A) (c)  a t(A)(c)»" — ") 
a GAdO  Gc£)  (h(domlheading-2)  a ctdomi  headi  ng-2  (h) ) 
a tIA) (c)*t' (A) (c))) 


then  return  true 
else  return  false 


Figure  A3.3  - Col-func  Constraint  Type 
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Subset  ( ( heading- l'JHead  i ng, r-lfle  I at  i on) , ( heading-2: Head  i ng,r-<hRe  I at  i on) ) 
returns  Boolean 

X Check  that  characteristic  values  In  r-1  tuples  are  a subset  X 
X of  those  In  r-2  tuples.  X 

X Ooes  not  require  that  characteristics  describe  same  predicate :cese  pair  X 
X or  same  predicate.  X 

if  |dom(headtit0-l)  |-|domU»eodinp-2)  |-1 

X each  relation  may  describe  only  a single  predicate  or  X 

X set  of  predicate :case  pairs  X 

then  if  lc/  | (3t:  tuple) (tc r-1  a I c/I -Image  It))) 

g I cf  | <3t: tuple) 1 tc r-2  a Ic/I -Image (t) )) 

then  return  true 

else  return  false 

else  return  error 


Figure  A3. 4 - Subset  Constraint  Type 


Equate  ( (heading-!  -.Head  i ng,r- l:Re  I a t i on) , (heodt  ng-2Me  ad  i ng,r-2Ae  I at  i on)  ) 
returns  Boolean 

X check  that  characteristic  values  In  r-i  tuples  are  equal  X 
X to  those  In  r-2  tuples.  X 

X Does  not  require  that  characteristics  describe  sane  predicate: case  pair  X 
X or  same  predicate.  X 

if  | doml heading-1)  |-|dom( head! *g-2)  |-1 

X each  relation  may  describe  only  a single  predicate  or  X 

X set  of  predicate :case  pairs  X 

then  if  I cf  | (3t:  tuple)  (tcr-1  A lc/1 -image (t))) 

- Ic / | (3t:  tuple)  (tc r-2  /\  lc/)  -imaged) ) I 

then  return  true 

else  return  false 

else  return  error 


Figure  A3.6  - Equak  Constraint  Type 
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Non-null  ( 0»ead1*g-i:Headi  ng,r-/:fte  I at  ion) ) 
returns  Boo  I ean 

X checks  that  no  null  values  are  present  In  relation  X 

if  -(3t:Tuple)  GAjH)  (3c:C)  (tc r-1 
a Acdom(t) 

a c<dom(t(A) ) a t (Ik)  (c)-“ — ") 

then  return  true 
else  return  false 


Figure  A3.6  - Non-null  Constraint  Type 
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Case-join  (coi-2:Case-col,co/-/:Case-col,(Ae(itfifl0-2:Heading,r-2tfelat  ion), 
(/ieadtng-/:Heading,r-l:Re  I at  ion) ) 
returns  (Heading, Relation) 

X Add  to  col-1  In  r-1  those  characteristics  defined  In  col-2  of  r-2  X 
X Match  tuples  according  to  equal  values  In  common  characteristics  X 
X of  col-1  and  col-2  X 

if  col-1  < dom (heading-1) 

A col-2  t domiheading-2 ) 

A col-1. type  - coi-2.type  X same  entity  type  X 

A (Vc.*C)  (adom(heading-Kcol-l) ) A c(dom(heading-2(col-2) ) 
d heading- l(col-l)  (c)  - heading-2(col-2) (c) ) 

X same  domains  for  common  characteristics  X 


then  return  ( (\h‘.  if  h'  - col-1  X heading  definition  X 

then  return  (Xc'.if  c'  t dom(coi-i) 

then  return  coi-J(c') 
else  return  col-21 c')) 
else  return  heading-l(h')), 

lt:Tuple|  (3t':Tuple)  (3t * ’:Tuple)  X tuple  definition  X 

X consider  tuple  from  each  relation  X 


(t*  t r-1  A t"  i r-2 


Adom(t)  - dom(t') 

A dom  (t  (col -i) ) - dom(t*  (col-i) ) U dom(t*  ’ (cof-2) ) 

X add  characteristics  from  r-2  X 


A (VA:H)  (h*col-l  d t (D) -t ' (A) ) 


X values  equal  to  those  In  r-1  for  all  columns  other  than  those  Joined  X 
A ( Vc;C)  ( (c<dom(t'  lcol-1) ) 

d f (col-1) (c)-t(c ol-l) (c) ) 

t values  equal  to  those  In  r-1  for  characteristics  defined  In  r-1  X 
A (ccdomlf  (col-2))  d t”  (col-2)  (c) -t(cot-I)  (c) ) )>l ) 

X values  equal  to  those  In  r-2  for  characteristics  defined  In  r-2  X 


else  return  error 


Figure  A3.7  - Case-join  Constraint  Operation  Type 
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Predicate-join  (pred&,(heading-2:Head\ng,  r-2:Relat  ion) , 

(headi  ng- 1 '.Heading,  r-f :Re I at  ion) ) 


returns  (Heading, Re  I at ion) 


X Add  to  r-1  the  Information  In  r-2  concerning  the  predicate  pred  X 
X Hatch  tuples  according  to  equal  values  for  common  cases  and  characteristics  X 
X of  pred  X 


if  (Vc:C)  (ccdom  (heading-l(pred))  A cidom(heading-2lpred) ) 

^ heading- l(ftred)  (c)  = heading-2(pred)  (c) ) 

X predicate  characteristics  common  to  both  relations  defined  on  same  domains  X 


X all  cases  of  pred  common  to  both  relations  are  defined  for  same  X 
X entity  type,  same  characteristics  and  same  domains  X 
A (Vco/:Case-co I ) (Vcol  ':Case-col ) (Vs:S) 

(coUdom(heading-l)  A (cot  'cdomUieadtng-2) 

A (pred,s)  ( col. predicate-case 
A (pred,s)  c cot  ’.predicate-case 
o cot. type  = cot  '.type 

A heading-1 (col ) - heading-2(col ')) 

X pred  not  combined  with  any  other  predicate  In  any  column  of  r-2  % 
A (Vcot ':Case-col ) (Vs:S) 

(cot*  ( dom  (heading-2)  A (pred, 5)  c cot ’.predicate-case 
o cot ’.predicate-case  - ( ( pred,s)  \ ) 

then  let  new-cols  - lcot:Case-col  | (3s:S)  (col  c dom ( headi ng-2) 

A t (pred, s)l  - cot. predicate-case 
A ->(3coi  ’iCase-col ) (cot  ’ c dom ( heading- 1) 

A ( pred,s ) c cot  ’.predicate-case) ) I 

X note  those  cases  of  pred  In  r-2  which  are  not  present  In  r-1  X 

in  let  new-head  - \h'.  if  h'  - pred  X define  new  heading  X 

then  return  ( Ac  ’.if  c'  c do  m(col-l) 
then  return  cot-i (c ’ ) 

X pred  characteristics  defined  In  r-1  X 
else  return  cot-2(c’)) 

X pred  characteristics  defined  In  r-2  but  not  r-1  X 
else  if  h'  t new-cols 

then  return  heading-2lh')  X new  cases  X 
else  return  heading-Hh’) 


Figure  A3.8  - Predicate- join  Constraint  Operation  Type 
(Continued  on  A text  Page) 
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II' 

X define  tuples  of  resulting  relation  X 
inlet  r*  - (t:Tuple  I (3t’:Tuple)  (3t":Tuple) 

X consider  tuple  from  each  relation  X 
( t * c r-1  A t"  c r-2 

A dom(t)  «dom(t’)  u new-cols  u 1 pjP | p=  pred  A pcdomlt'*)) 

A dom  ( t ( pred) ) =dom  ( t ' ( pred ) ) U dom  ( t * ' ( pred) ) 

A (VA:H)  ( (A  c dom(  t ’ ) -predo  t ' ( A)  - t (A) ) 

X values  equal  to  those  In  r-1  X 
A (A  < new-cols  o t(/i)  » t * * ( /») ) 

X values  of  cases  for  pred  not  In  r-1  equal  to  those  In  r-2  % 

X values  equal  to  those  In  r-2  for  cases  of  pred  X 
A (3col:Case-col ) (col  c dom(t'*) 
a h d dom(t)  A coi  < h 

o t"(col)  - t(A))) 

X values  equal  to  those  In  r-1  and  r-2  for  characteristics  of  pred  X 
A (Vc:C) 

( (c  < dom(t'  (pred) ) 

d t' (pred) (c )=t(pred) (c) ) 

A (c  c domU’  ’ (pred) ) 

5 t"  (pred)  (c)-t(pred)  (c) ) ) ) ) 

in  return  (new-head, r'i 
else  return  error 


Figure  A3.8  - Predicate-join  Constraint  Operation  Type 
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Conjunction tcol-list-l£o\- 1 ist,col-ltst-2:Col-l  iet, 

( heading- l:Head  i ng  ,r-ldRe  I at  i on) , 

( heading-2:He3d  i ng,r-2:Re  I at  i on) ) 
returns  (Heading, Re  I at ion) 

X combine  relations  describing  different  predicates  Into  a single  relation.  X 
X The  columns  of  col-list-1  In  r-1  are  combined  In  sequence  with  X 
X the  columns  of  col-list-2  In  r-2  X 

let  n - length(col-ltst-f) 
m - length  ( co l-  lis  t-2) 
col-set-1  - tcol-list-HH 

col-set-2  - tcol-l tst-2Jtl 

in  if  col-set- 1 c dom (heading-1)  A col-set-2  g dom ( heading-2) 

A n - m A |col-set-f|  - n A |col-set-2|  - m 

X lists  of  same  length  with  no  duplicates  X 
A (V  l<i<n)  ((col-Hst-lU).type  - (col-list-2ii). type 

A heading-1  (col-list-Hi)  - heading-2(col-list-2ii) ) 

X corresponding  columns  describe  same  entity  type  and  same  characteristics  X 

X check  that  sets  of  predicates  In  the  two  relations  are  mutually  exclusive  X 
A -13 p-J3)  (3s:S)  (3s  *:S)  (3co/:Case-col ) (3col  'iCase-col ) 

( ( p,s ) c col. predicate-case 
A (p,s ')  c col '.predicate-case  . 

A col  c dom iheading-1) 

A col’  c dom  (heading-2)) 

A -’(3 p-J3)  ( p c dom  (heading-1) 

A p ( dom  (heading-2) ) 

then  let  col-list-ne w - consJ_j  ( (col-list-l±1). predicate-case  U 

(col- 1 is t-2ii). predicate-case,  tcol-list-lli)  .type) 

X new  columns  describe  union  of  predicate: case  pairs  In  Joined  columns  X 

in  let  col-set-new  - Icol-f tst-neelil 

in  let  new-r  - ( t:Tuple  | X define  tuples  In  resulting  relation  X 
dom(t)  - dom theading-1)  - col-set-1 

U dom  (heading-2)  - col-set-2 
u col-set-new 


Figure  A3.9  - Conjunction  Constraint  Operation  Type 
(Page  1 of  3) 


X case  1 - natch  of  equal  values  In  Joined  columns  X 

A <U3t':Tuple)  (t'  c r-1 

A (VA:H)((A  c dom  (heading- 1)  - col-set-1 

3 t'(A)  - t (Ik)  > 

A (A  c col-set-1 

3 t ' (Ik)  - t <<A'  | A*  < col— set— new  a »£**>)))) 

A Gt*:Tuple)  ( t * c r-2 

A ( V/txH)  ( ( A c dom  (Aeatf tnp-2)  - col-set-2 

3 t ’ (Ik)  - tt (Ik) ) 

A (A  < col-set-2 

D t • (Ik)  - t(<A'  | A*  < col— set— new  a *£**>))))) 

X case  2 - no  match  In  r-2  for  tuple  In  r-1  - pad  with  null  values  X 
v [ Gt’:Tuple)  (f  c r-1 

A (VA:HM(A  c dom (heading-1)  - col-set-1 

o rih)  - t (*> ) 

A (A  < col-set-1 

3 t * (Ik)  - t(<A*  | A ’ c col— set— new  a A£A*>)))) 

a — (3t ' : Tuple) It*  c r-2 

a (Vlks H)  ( A c col-set-2 

3 t * ( Ik)  - t(<A’  | A'  c col— set— new  a A£A’>))) 
a (VA:HMA  f dom  (heading -2)  - col-set-2 

3 <dom(t(A))  - dom (headi ng -2(h) ) 

a (Vc:C)  (ccdom<t<A))  3t(A)(c)«" ")))] 

X case  3 - no  match  In  r-1  for  tuple  In  r-2  - pad  with  null  values  X 
v t(3t’:Tuple)  it’  < r-2 

a (Vlk: H)  ( ( Ik  c dom i heading-2)  - cot-set-2 

d t*  (Ik)  - t (Ik) ) 
a (A  c col-set-2 

3 t • ( A)  - t(<A’  | A’  c col-set-new  a A£A'>)U) 

a -■  Gt  * : Tup  I e)  ( t ’ c r-1 

a (VA:H) (A  c col-set-1 

3 t * ( A)  - t(<A*  | A'  c col-set-new  a A£A’>))) 
a (VA:H)  (A  < dom  (A eading-1)  - col-set-1 

3 (dom(t(AH  * dom  (heading -lih)) 

a (Vc:C)  (ccdom(t(A) ) 3 t(A)  (c»-~ ")))))) 


Figure  A3. 9 - Conjunction  Constraint  Operation  Type 
(Page  2 of 
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in  let  new-head  - XA'.if  h’  c col-set-new  X define  new  heading  X 

then  return  heading-H<hd 1 | h c col-set- 1 

a *<**>) 

else  if  h'  c col-set-1 

then  return  heading-Hh’) 
else  return  heading-2 Ih *) 


in  return  (see- head, new- r) 
else  return  error 


Figure  A3. 9 - Conjunction  Constraint  Operation  Type 
( Page  3 of  3) 


- 311  - 


Biographio  Note 


Sheldon  Asher  Borkln  was  born  November  16,  1952  In  Detroit,  Michigan 
and  was  raised  in  Huntington  Woods,  Michigan.  Mr.  Borkln  attended  Berkley 
High  School  in  Berkley,  Michigan  where  he  received  a National  Merit  Award, 
the  Science  Department  Award,  and  the  Rensselaer  Mathematics  and  Science 
Award.  He  was  a member  of  the  National  Honor  Society  and  president  of  the 
Science  Research  Project.  He  graduated  high  school  summa  cum  laude  in  June 
1970. 

Beginning  in  September  1970,  Mr.  Borkln  attended  the  University  of 
Michigan  in  Ann  Arbor  where  he  was  in  the  Honors  Program.  In  May  1974, 
he  received  the  Bachelor  of  Science  degree  with  high  distinction  and 
departmental  honors  in  the  Computer  and  Communication  Sciences  Department. 
At  the  University  of  Michigan,  Mr.  Borkln  received  the  Branstrom  Prize,  was 
elected  a member  of  Phi  Beta  Kappa,  and  was  named  both  a Regents-Alumni 
Scholar  and  an  Angell  Scholar. 

Mr.  Borkln  has  attended  the  Massachusetts  Institute  of  Technology  in 
Cambridge,  Massachusetts  since  February  1975  where  he  has  been  associated 
with  the  Computation  Structures  Group  of  the  Laboratory  for  Computer  Science. 
He  received  the  Master  of  Science  and  Electrical  Engineer  degrees  in  June  1978 
from  the  Department  of  Electrical  Engineering  and  Computer  Science.  His  thesis 
title  was  "Data  Model  Equivalence."  Mr.  Borkln  was  a Teaching  Assistant  in  the 
Department  of  Electrical  Engineering  and  Computer  Science  from  September  1976 
until  May  1978.  In  May  1978,  he  received  a Departmental  Teaching  Award. 

He  was  initiated  into  Sigma  Xi,  the  honorary  scientific  research  society  in 
April  1976.  Mr.  Borkin  is  also  a member  of  the  Association  for  Computing 
Machinery  and  the  Institute  of  Electrical  and  Electronics  Engineers.  He  has  been 
married  to  his  wife,  Sandra,  since  July  1973. 

Mr.  Borkin  was  employed  at  the  Great  Lakes  Environmental  Research 
Laboratory  in  Detroit,  Michigan  during  the  summers  of  1972,  1973  and  1974. 
He  was  employed  by  Bell  Telephone  Laboratories  in  Piscataway,  New  Jersey 
from  August  1974  until  February  1975.  During  the  summer  of  1977, 
Mr.  Borkin  was  associated  with  the  Charles  Stark  Draper  Laboratory  in 
Cambridge,  Massachusetts.  Since  December  1978,  Mr.  Borkin  has  been  employed 
by  the  IBM  Cambridge  Scientific  Center  in  Cambridge,  Massachusetts  where  he  is 
continuing  research  in  the  field  of  database  systems  and  their  user  interfaces. 

Mr.  Borkin  has  published: 

Bor!  in,  S.A.,  "Data  Model  Equivalence",  Proceedings  of  the  Fourth  International 
Conference  on  Very  Large  Data  Bases.  IEEE,  September  1978.  (also  published  as 
MIT  Laboratory  for  Computer  Science  Technical  Memo  118  and  Computation 
Structures  Group  Memo  158) 


l 


OFFICIAL  DISTRIBUTION  LIST 


Defense  Documentation  Center 
Cameron  Station 
Alexandria,  VA  22314 
12  copies 

Office  of  Naval  Research 
Information  Systems  Program 
Code  437 

Arlington,  VA  22217 
2 copies 

Office  of  Naval  Research 
Branch  Office/Boston 
495  Summer  Street 
Boston,  MA  02210 
1 copy 

Office  of  Naval  Research 
Branch  Office/Chicago 
536  South  Clark  Street 
Chicago,  IL  60605 
1 copy 

Office  of  Naval  Research 
Branch  Of f ice/Pasadena 
1030  East  Green  Street 
Pasadena,  CA  91106 
1 copy 

New  York  Area  Office 
715  Broadway  - 5th  floor 
New  York,  N.  Y.  10003 
1 copy 

Naval  Research  Laboratory 
Technical  Information  Division 
Code  2627 

Washington,  D.  C.  20375 
6 copies 

Assistant  Chief  for  Technology 
Office  of  Naval  Research 
Code  200 

Arlington,  VA  22217 
1 copy 

Office  of  Naval  Research 
Code  455 

Arlington,  VA  22217 
1 copy 


Dr.  A.  L.  Slafkosky 
Scientific  Advisor 
Commandant  of  the  Marine  Corps 
(Code  RD-1) 

Washington,  D.  C.  20380 
1 copy 

Office  of  Naval  Research 
Code  458 

Arlington,  VA  22217 
1 copy 

Naval  Electronics  Lab  Center 
Advanced  Software  Technology 
Division  - Code  5200 
San  Diego,  CA  92152 
1 copy 

Mr.  E.  H.  Gleissner 

Naval  Ship  Research  & Development  Center 
Computation  & Math  Department 
Bethesda,  MD  20084 
1 copy 

Captain  Grace  M.  Hopper 
NAICOM/MIS  Planning  Branch 
(0P-916D) 

Office  of  Chief  of  Naval  Operations 
Washington,  D.  C.  20350 
1 copy 

Mr.  Kin  B.  Thompson 
Technical  Director 
Information  Systems  Division 
(0P-91T) 

Office  of  Chief  of  Naval  Operations 
Washington,  D.  C.  20350 
1 copy 

Captain  Richard  L.  Martin,  USN 

Commanding  Officer 

USS  Francis  Marion  (LPA-249) 

FP0  New  York,  N.  Y.  09501 
1 copy 


